Chris Weldon

Rants and musings on .Net, SharePoint, PHP, and consulting.

Getting "Yelled" At

Okay, so when I got into work this past Friday, I was greeted by an email asking about all the email I had sent over the weekend. Here’s the nice little box that the administrator sent me:

Host/Domain Summary: Messages Received
 msg cnt   bytes   host/domain
 -------- -------  -----------
 125555   288377k  HOST.tamu.edu

Background:

Over the previous weekend I had set up a really cool script to automatically populate the Keystone test database with “hopefully” 300K+ emails, so that I would have a nice amount of data to test Keystone with (our current test server has <800 slips in it, which is not a representative amount compared to production).

I’ll post the scripts here in a moment, but this email didn’t frustrate me. It was the follow-up email that did:

Hi Chris,
  While it may add variety, it really isn't an appropriate use of the
University relays.  Please don't do this anymore.  You could have also
gotten variety by using aliases at a directly routeable host.  Thank you!

--Tom

Christopher Weldon wrote:
> Tom wrote:
>> Hi Chris,
>>     You came up in this morning's log report as having sent over 125,000
>> messages through the relays yesterday.  Mind if I ask what's going on?
>> Thanks!
>>
>> --Tom
>>
>>
>> Host/Domain Summary: Messages Received
>>  msg cnt   bytes   host/domain
>>  -------- -------  -----------
>>  125555   288377k  HOST.tamu.edu
>>
>
> I'm populating my keystone test database. All messages are coming from
> the system and going back to the system, but many of the automatic
> replies are going to my various aliases such as 'neraath@tamu.edu',
> 'linux@tamu.edu', which end up going back to the system. It adds a
> little variety to the slips I'm creating.

Much to his dismay, I ended up using several “local” aliases so that mail wouldn’t be bounced off the relays. However, the weekend script (which was going to take 7 days to produce 300K+ emails) didn’t go as expected, because the database maxed out at 4GB. So, once I remade the database with a new limit in the dozens-of-terabytes range, I needed to populate it really quickly. Thus, I ran about 20 threads of the script, each started 1 second after the previous one, and with the 5-second delay between messages removed (the commented-out sleep(5) in the script). This started to populate the database VERY quickly, but also ate system resources like no other!
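For the curious, here is a rough sketch of how a staggered launch like that could be put together. It is not the launcher that was actually used; the helper name `staggered_commands()` and the shell wrapper around `sleep` are assumptions made purely for illustration, and it only prints the commands rather than running them.

```php
<?php
// Hypothetical helper (not part of the original scripts): build one shell
// command per instance, each delayed by a further $offset seconds so the
// copies do not all start at the same moment.
function staggered_commands($script, $instances, $offset = 1) {
    $commands = array();
    for ($i = 0; $i < $instances; $i++) {
        $delay = $i * $offset;
        // Background the job and discard its output so a launcher would not block
        $commands[] = sprintf(
            '(sleep %d; php %s) > /dev/null 2>&1 &',
            $delay,
            escapeshellarg($script)
        );
    }
    return $commands;
}

// Print the 20 commands; passing each one to exec() instead would launch them.
foreach (staggered_commands('mass_mailer_script.php', 20) as $command) {
    echo $command."\n";
}
```

Each instance is a separate PHP process with its own feed cache, which goes some way toward explaining why this approach eats system resources so quickly.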

So, in all, about 700K+ emails were generated from my host and, as can be seen, 125K made it to the TAMU relays. My opinion: big deal. The relays are used for sending messages out, and any administrator would realize that a “properly configured” (something that never happens @ TAMU) mail relay will be able to handle TONS of traffic. Additionally, WAY more spam goes through the mail relays (in the millions per day) than that, and at least I was using the relays to actually do work, as opposed to appeasing my own hobbies. Thus, this should have been considered an “acceptable use of the relays”. All in all, I don’t like being talked down to by an administrator who has enough problems of his own trying to keep the damn relays from failing.

But I guess one cool thing came out of all this: I beat out Facebook on the number of emails through the relay. :-D

Scripts

The way these scripts work is that they have arrays of sources, “from” addresses, and “to” addresses to choose from. The “randomness” is really only pseudo-random numbers generated with PHP’s rand() function. The sources are fortune, local files, cached news stories from RSS feeds, and a few other things.
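To make the selection step concrete, here is a minimal sketch of picking a source at random and splitting it into its type and location. The helper names `pick_random()` and `parse_source()` are made up for this illustration and do not appear in the scripts, and the sketch uses `array_rand()` in place of the `rand()` index arithmetic.

```php
<?php
// Hypothetical helper: return one entry of the array at random.
function pick_random(array $items) {
    // array_rand() returns a random key, so no "size - 1" arithmetic is needed
    return $items[array_rand($items)];
}

// Hypothetical helper: split a "type:location" entry into its two parts.
function parse_source($source) {
    // The limit of 2 keeps any later colons inside the location
    list($type, $loc) = explode(':', $source, 2);
    return array('type' => $type, 'loc' => $loc);
}

// Listing an entry more than once makes it proportionally more likely to be picked.
$sources = array(
    'exec:/usr/bin/fortune',
    'feed:getArticleFromFeeds',
    'feed:getArticleFromFeeds',
    'file:/var/log/Xorg.0.log',
);

$picked = parse_source(pick_random($sources));
echo "Type: ".$picked['type'].", location: ".$picked['loc']."\n";
```

Repeating the feed entry is the same trick the real source list uses to weight the mix toward news stories.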

mass_mailer_script.php
<?php
#----------------------------------------------------
# mass_mailer_script.php
#
# Created by: Christopher Weldon
#
# Purpose: To use a variety of different "from" addresses
# and mass-mail Keystone to populate its database in a quick manner.
#
#-------------------------------------------------------

// Report every error and notice (the literal 2047 was E_ALL on older PHP versions)
error_reporting(E_ALL);
ini_set('display_errors', 1);

$feedcache = array();

function getArticleFromFeeds() {
    global $feedcache;

    // Listing of Feeds I'd like to retrieve from
    $feeds = array(
                'http://planet.gentoo.org/rss20.xml',
                'http://slashdot.org/index.rss',
                'http://slashdot.org/apple.rss',
                'http://slashdot.org/linux.rss',
                'http://slashdot.org/science.rss',
                'http://slashdot.org/developers.rss',
                'http://slashdot.org/askslashdot.rss',
                'http://www.phparchitect.com/discuss/rdf.php?mode=m&l=1&basic=1',
                'http://hades.phparch.com/hermes/feednews/index.php',
                'http://www.nytimes.com/services/xml/rss/nyt/Business.xml',
                'http://www.nytimes.com/services/xml/rss/nyt/Arts.xml',
                'http://www.nytimes.com/services/xml/rss/nyt/Science.xml',
                'http://www.nytimes.com/services/xml/rss/nyt/Technology.xml',
                'http://www.nytimes.com/services/xml/rss/nyt/WeekinReview.xml',
                'http://rss.news.yahoo.com/rss/topstories'
    );

    // Figure out which one we want to use
    $chosen_feed = $feeds[rand(0,(sizeof($feeds)-1))];
    echo "Chosen feed is: ".$chosen_feed."\n";

    if (array_key_exists($chosen_feed, $feedcache) && sizeof($feedcache) > 0) {
        echo "Using the cached feed.\n";
        $article = $feedcache[$chosen_feed][rand(0,(sizeof($feedcache[$chosen_feed])-1))];
        return $article;
    } else {
        // Open the feed using fopen
        $feed = fopen($chosen_feed, "r");
        if (!$feed) {
            echo "There was a problem opening the feed!\n";
            exit();
        }

        include_once('XML/RSS.php');
        // download and parse RSS data
        // (plain assignment: "=& new" is a parse error on PHP 7 and later)
        $rss = new XML_RSS($chosen_feed);
        $rss->parse();

        // print headlines
        // print_r($rss->getItems());
        $articles = $rss->getItems();
        unset($rss);
        fclose($feed);
        echo "Caching the feed.\n";
        $feedcache[$chosen_feed] = $articles;
        return $articles[rand(0,(sizeof($articles)-1))];
    }
}

function gen_email() {
    $from_addresses = array(
                        'Christopher Weldon <neraath@tamu.edu>',
                        'Chris Weldon <linux@tamu.edu>',
                        'Chris Test <chris-test2@tamu.edu>'
    ); /* This is normally a lot longer, but I cut it short as you should get the basic gist. */

    $to_addresses = array(
                        'HDC <helpdesk@HOST>',
                        'Keystone <test-submit@HOST>',
                        'Operations-Keystone <test-submit-keystone@HOST>'
    );

    $sources = array(
        'exec:/usr/bin/fortune',
        'feed:getArticleFromFeeds',
        'feed:getArticleFromFeeds',
        'feed:getArticleFromFeeds',
        'feed:getArticleFromFeeds',
        'feed:getArticleFromFeeds',
        'feed:getArticleFromFeeds',
        'file:/var/log/Xorg.0.log',
        'file:/var/log/keystone/webapp/alpha_keystone.log',
        'file:/var/log/keystone/webapp/alpha_php_error_log.log'
    );

    // First, figure out the source
    $source = $sources[rand(0,(sizeof($sources)-1))];
    list($type,$loc) = explode(':', $source);

    switch($type) {
        case 'file':
            $subject = 'Contents of '.$loc;
            $body = file_get_contents($loc, false);
            $must_be_to = 'test-noreply@HOST.tamu.edu';
            break;
        case 'feed':
            $ret_data = $loc();
            $subject = $ret_data['title'];
            $body = strip_tags($ret_data['description']);
            if (array_key_exists('dc:creator', $ret_data)) {
                $author = $ret_data['dc:creator'];
            }
            break;
        case 'exec':
            $subject = 'A new fortune for you on '.time();
            $body = shell_exec($loc);
            break;
        default:
            echo "Could not determine type!\n";
            return;
    }

    // Assemble the message
    if (isset($author)) {
        $from = $author.' <linux@tamu.edu>';
    } else if (isset($must_be_from)) {
        $from = $must_be_from;
    } else {
        $from = $from_addresses[rand(0,(sizeof($from_addresses)-1))];
    }

    $headers =      'From: '.$from."\r\n".
                    'X-Mailer: PHP/'.phpversion()." \r\n".
                    'X-Comment: Created from mass_mailer_script.php'."\r\n";

    if (isset($must_be_to)) {
        $to = $must_be_to;
    } else {
        $to = $to_addresses[rand(0,(sizeof($to_addresses)-1))];
    }

    echo "The following message: \n".
            "\tFrom: ".$from."\n".
            "\tTo: ".$to."\n".
            "\tSubject: ".$subject."\n";

    if (mail($to, $subject, $body, $headers)) {
        echo "was sent successfully!\n\n";
    } else {
        echo "was unsuccessful!\n\n";
    }

//      sleep(5);
}

// START HERE
for ($i = 0; $i < 30000; $i++) {
    gen_email();
}

?>
rss_feed_retrieval.php
<?php
function get_article_from_feeds() {
    // Listing of Feeds I'd like to retrieve from
    $feeds = array(
                    'http://planet.gentoo.org/rss20.xml',
                    'http://slashdot.org/index.rss',
                    'http://slashdot.org/apple.rss',
                    'http://slashdot.org/linux.rss',
                    'http://slashdot.org/science.rss',
                    'http://slashdot.org/developers.rss',
                    'http://slashdot.org/askslashdot.rss',
                    'http://www.phparchitect.com/discuss/rdf.php?mode=m&l=1&basic=1',
                    'http://hades.phparch.com/hermes/feednews/index.php',
                    'http://www.nytimes.com/services/xml/rss/nyt/Business.xml',
                    'http://www.nytimes.com/services/xml/rss/nyt/Arts.xml',
                    'http://www.nytimes.com/services/xml/rss/nyt/Science.xml',
                    'http://www.nytimes.com/services/xml/rss/nyt/Technology.xml',
                    'http://www.nytimes.com/services/xml/rss/nyt/WeekinReview.xml',
                    'http://rss.news.yahoo.com/rss/topstories'
            );

    // Figure out which one we want to use
    $chosen_feed = $feeds[rand(0,(sizeof($feeds)-1))];
    echo "Chosen feed is: ".$chosen_feed."\n";

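    // Open the feed using fopen to make sure it is reachable
    $feed = fopen($chosen_feed, "r");
    if (!$feed) {
        echo "There was a problem opening the feed!\n";
        exit();
    }
    // The handle was only a reachability check, so release it right away
    fclose($feed);

    // include_once so that repeated calls don't redeclare the XML_RSS class
    include_once('XML/RSS.php');
    // download and parse RSS data
    // (plain assignment: "=& new" is a parse error on PHP 7 and later)
    $rss = new XML_RSS($chosen_feed);
    $rss->parse();

    // print headlines
    // print_r($rss->getItems());
    $articles = $rss->getItems();
    return $articles[rand(0,(sizeof($articles)-1))];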
}
?>
