Getting "Yelled" At
Okay, so when I got into work this past Friday, I was greeted by an email in my inbox asking about the email I had sent over the weekend. Here’s a nice little box that the administrator sent me:
Host/Domain Summary: Messages Received
 msg cnt    bytes   host/domain
 -------- -------  -----------
  125555  288377k  HOST.tamu.edu
Background:
Over the previous weekend I had set up a really cool script to automatically populate the Keystone test database with “hopefully” 300K+ emails, so that I could have a decent amount of data to test Keystone with (our current test server has <800 slips in it, which is not a representative amount compared to production).
I’ll post the scripts here in a moment, but this email didn’t frustrate me. It was the one after it that did:
Hi Chris,
While it may add variety, it really isn't an appropriate use of the
University relays. Please don't do this anymore. You could have also
gotten variety by using aliases at a directly routeable host. Thank you!
--Tom
Christopher Weldon wrote:
> Tom wrote:
>> Hi Chris,
>> You came up in this morning's log report as having sent over 125,000
>> messages through the relays yesterday. Mind if I ask what's going on?
>> Thanks!
>>
>> --Tom
>>
>>
>> Host/Domain Summary: Messages Received
>>  msg cnt    bytes   host/domain
>>  -------- -------  -----------
>>   125555  288377k  HOST.tamu.edu
>>
>
> I'm populating my keystone test database. All messages are coming from
> the system and going back to the system, but many of the automatic
> replies are going to my various aliases such as 'neraath@tamu.edu',
> 'linux@tamu.edu', which end up going back to the system. It adds a
> little variety to the slips I'm creating.
Much to his dismay, I ended up using several “local” aliases so that mail wouldn’t be bounced off the relays. However, this weekend script (which was going to take 7 days to produce 300K+ emails) didn’t go as expected, because the database maxed out at 4GB. So, once I remade the database with a new limit in the dozens-of-terabytes range, I needed to populate it really quickly. Thus, I ran about 20 instances of the script, started at 1-second offsets and without the 5-second delay between messages. This populated the database VERY quickly, but also ate system resources like no other!
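The staggered launch could be sketched roughly as follows. This is a guess at how it might be done, not the commands I actually ran: the helper names build_launch_plan() and launch_staggered(), the php binary on the PATH, and the /dev/null redirection are all assumptions for illustration. As written it only prints the plan (a dry run).

```php
<?php
// Hypothetical launcher; a sketch, not the commands actually used.

// Build one background shell command per instance, each paired with
// the number of seconds (from the start) to wait before launching it.
function build_launch_plan($script, $instances, $offset_seconds) {
    $plan = array();
    for ($i = 0; $i < $instances; $i++) {
        $plan[] = array(
            'delay'   => $i * $offset_seconds,
            // Redirecting output and appending "&" lets exec() return at once.
            'command' => 'php ' . escapeshellarg($script) . ' > /dev/null 2>&1 &',
        );
    }
    return $plan;
}

// Start each instance, sleeping between launches to stagger them.
function launch_staggered(array $plan) {
    $elapsed = 0;
    foreach ($plan as $step) {
        sleep($step['delay'] - $elapsed);
        $elapsed = $step['delay'];
        exec($step['command']);
    }
}

// Dry run: print the plan for 20 instances at 1-second offsets.
$plan = build_launch_plan('mass_mailer_script.php', 20, 1);
foreach ($plan as $step) {
    echo '+' . $step['delay'] . 's  ' . $step['command'] . "\n";
}
// launch_staggered($plan); // uncomment to actually start the instances
```

With the 30,000-iteration loop in the mailer script, 20 instances would attempt up to 600,000 messages per batch.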
So, in all, 700K+ emails were generated from my host and, as can be seen, 125K made it to the TAMU relays. My opinion: big deal. The relays exist for sending messages out, and any administrator would realize that a “properly configured” (something that never happens at TAMU) mail relay can handle TONS of traffic. Additionally, WAY more spam than that goes through the mail relays (in the millions per day), and at least I was using the relays to actually do work, as opposed to indulging my own hobbies. Thus, this should have been considered an “acceptable use of the relays”. All in all, I don’t like being talked down to by an administrator who has enough problems of his own trying to keep the damn relays from failing.
But I guess one cool thing came out of all this: I beat out Facebook on the number of emails through the relay. :-D
Scripts
These scripts work from three arrays: content sources, “from” addresses, and “to” addresses, and pick one of each per message. The “randomness” is really only pseudo-random numbers generated with PHP’s rand() function. The sources are fortune, local files, cached news stories from RSS feeds, and a few other things.
<?php
#----------------------------------------------------
# mass_mailer_script.php
#
# Created by: Christopher Weldon
#
# Purpose: To use a variety of different "from" addresses
# and mass-mail Keystone to populate its database in a quick manner.
#
#-------------------------------------------------------
error_reporting(E_ALL); // the literal 2047 only equalled E_ALL on old PHP versions
ini_set('display_errors', 1);
$feedcache = array();
function getArticleFromFeeds() {
    global $feedcache;
    // Listing of Feeds I'd like to retrieve from
    $feeds = array(
        'http://planet.gentoo.org/rss20.xml',
        'http://slashdot.org/index.rss',
        'http://slashdot.org/apple.rss',
        'http://slashdot.org/linux.rss',
        'http://slashdot.org/science.rss',
        'http://slashdot.org/developers.rss',
        'http://slashdot.org/askslashdot.rss',
        'http://www.phparchitect.com/discuss/rdf.php?mode=m&l=1&basic=1',
        'http://hades.phparch.com/hermes/feednews/index.php',
        'http://www.nytimes.com/services/xml/rss/nyt/Business.xml',
        'http://www.nytimes.com/services/xml/rss/nyt/Arts.xml',
        'http://www.nytimes.com/services/xml/rss/nyt/Science.xml',
        'http://www.nytimes.com/services/xml/rss/nyt/Technology.xml',
        'http://www.nytimes.com/services/xml/rss/nyt/WeekinReview.xml',
        'http://rss.news.yahoo.com/rss/topstories'
    );
    // Figure out which one we want to use
    $chosen_feed = $feeds[rand(0,(sizeof($feeds)-1))];
    echo "Chosen feed is: ".$chosen_feed."\n";
    // Use the cache only if this particular feed was cached with at least one article
    if (array_key_exists($chosen_feed, $feedcache) && sizeof($feedcache[$chosen_feed]) > 0) {
        echo "Using the cached feed.\n";
        $article = $feedcache[$chosen_feed][rand(0,(sizeof($feedcache[$chosen_feed])-1))];
        return $article;
    } else {
        // Open the feed using fopen to check that it is reachable
        $feed = fopen($chosen_feed, "r");
        if (!$feed) {
            echo "There was a problem opening the feed!\n";
            exit();
        }
        include_once('XML/RSS.php');
        // download and parse RSS data
        // (plain "=" here: assigning new by reference with "=&" is a syntax error on PHP 7+)
        $rss = new XML_RSS($chosen_feed);
        $rss->parse();
        // print headlines
        // print_r($rss->getItems());
        $articles = $rss->getItems();
        unset($rss);
        fclose($feed);
        echo "Caching the feed.\n";
        $feedcache[$chosen_feed] = $articles;
        return $articles[rand(0,(sizeof($articles)-1))];
    }
}
function gen_email() {
    $from_addresses = array(
        'Christopher Weldon <neraath@tamu.edu>',
        'Chris Weldon <linux@tamu.edu>',
        'Chris Test <chris-test2@tamu.edu>'
    ); /* This is normally a lot longer, but I cut it short as you should get the basic gist. */
    $to_addresses = array(
        'HDC <helpdesk@HOST>',
        'Keystone <test-submit@HOST>',
        'Operations-Keystone <test-submit-keystone@HOST>'
    );
    // Each entry is "type:location"; repeating an entry makes it more likely to be picked
    $sources = array(
        'exec:/usr/bin/fortune',
        'feed:getArticleFromFeeds',
        'feed:getArticleFromFeeds',
        'feed:getArticleFromFeeds',
        'feed:getArticleFromFeeds',
        'feed:getArticleFromFeeds',
        'feed:getArticleFromFeeds',
        'file:/var/log/Xorg.0.log',
        'file:/var/log/keystone/webapp/alpha_keystone.log',
        'file:/var/log/keystone/webapp/alpha_php_error_log.log'
    );
    // First, figure out the source
    $source = $sources[rand(0,(sizeof($sources)-1))];
    // Limit of 2 so a colon inside the location cannot split it further
    list($type,$loc) = explode(':', $source, 2);
    switch($type) {
        case 'file':
            $subject = 'Contents of '.$loc;
            $body = file_get_contents($loc, false);
            $must_be_to = 'test-noreply@HOST.tamu.edu';
            break;
        case 'feed':
            // $loc holds a function name, called here as a variable function
            $ret_data = $loc();
            $subject = $ret_data['title'];
            $body = strip_tags($ret_data['description']);
            if (array_key_exists('dc:creator', $ret_data)) {
                $author = $ret_data['dc:creator'];
            }
            break;
        case 'exec':
            $subject = 'A new fortune for you on '.time();
            $body = shell_exec($loc);
            break;
        default:
            echo "Could not determine type!\n";
            return;
    }
    // Assemble the message
    if (isset($author)) {
        $from = $author.' <linux@tamu.edu>';
    } else if (isset($must_be_from)) {
        $from = $must_be_from;
    } else {
        $from = $from_addresses[rand(0,(sizeof($from_addresses)-1))];
    }
    $headers = 'From: '.$from."\r\n".
        'X-Mailer: PHP/'.phpversion()."\r\n".
        'X-Comment: Created from mass_mailer_script.php'."\r\n";
    if (isset($must_be_to)) {
        $to = $must_be_to;
    } else {
        $to = $to_addresses[rand(0,(sizeof($to_addresses)-1))];
    }
    echo "The following message: \n".
        "\tFrom: ".$from."\n".
        "\tTo: ".$to."\n".
        "\tSubject: ".$subject."\n";
    if (mail($to, $subject, $body, $headers)) {
        echo "was sent successfully!\n\n";
    } else {
        echo "was unsuccessful!\n\n";
    }
    // sleep(5);
}
// START HERE
for ($i = 0; $i < 30000; $i++) {
    gen_email();
}
?>
<?php
// Same idea as getArticleFromFeeds(), but without the feed cache
function get_article_from_feeds() {
    // Listing of Feeds I'd like to retrieve from
    $feeds = array(
        'http://planet.gentoo.org/rss20.xml',
        'http://slashdot.org/index.rss',
        'http://slashdot.org/apple.rss',
        'http://slashdot.org/linux.rss',
        'http://slashdot.org/science.rss',
        'http://slashdot.org/developers.rss',
        'http://slashdot.org/askslashdot.rss',
        'http://www.phparchitect.com/discuss/rdf.php?mode=m&l=1&basic=1',
        'http://hades.phparch.com/hermes/feednews/index.php',
        'http://www.nytimes.com/services/xml/rss/nyt/Business.xml',
        'http://www.nytimes.com/services/xml/rss/nyt/Arts.xml',
        'http://www.nytimes.com/services/xml/rss/nyt/Science.xml',
        'http://www.nytimes.com/services/xml/rss/nyt/Technology.xml',
        'http://www.nytimes.com/services/xml/rss/nyt/WeekinReview.xml',
        'http://rss.news.yahoo.com/rss/topstories'
    );
    // Figure out which one we want to use
    $chosen_feed = $feeds[rand(0,(sizeof($feeds)-1))];
    echo "Chosen feed is: ".$chosen_feed."\n";
    // Open the feed using fopen to check that it is reachable
    $feed = fopen($chosen_feed, "r");
    if (!$feed) {
        echo "There was a problem opening the feed!\n";
        exit();
    }
    fclose($feed);
    // include_once, so repeated calls do not try to redeclare the XML_RSS class
    include_once('XML/RSS.php');
    // download and parse RSS data
    // (plain "=" here: assigning new by reference with "=&" is a syntax error on PHP 7+)
    $rss = new XML_RSS($chosen_feed);
    $rss->parse();
    // print headlines
    // print_r($rss->getItems());
    $articles = $rss->getItems();
    return $articles[rand(0,(sizeof($articles)-1))];
}
?>
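Both scripts repeat the same random-pick expression, and the $sources list weights feeds simply by listing the same entry several times. A minimal sketch of that selection logic pulled into two helpers follows. The names pick_random() and parse_source() are hypothetical and not part of the original scripts, and mt_rand() is swapped in for rand().

```php
<?php
// Hypothetical helpers; not part of the original scripts.

// Return one random element of a non-empty, zero-indexed list.
function pick_random(array $items) {
    if (count($items) === 0) {
        throw new InvalidArgumentException('Cannot pick from an empty list');
    }
    return $items[mt_rand(0, count($items) - 1)];
}

// Split a "type:location" source string into its two parts.
// The limit of 2 keeps any further colons inside the location.
function parse_source($source) {
    $parts = explode(':', $source, 2);
    if (count($parts) !== 2) {
        throw new InvalidArgumentException('Malformed source: ' . $source);
    }
    return array('type' => $parts[0], 'loc' => $parts[1]);
}

// Repeating an entry makes it proportionally more likely to be chosen.
$sources = array(
    'exec:/usr/bin/fortune',
    'feed:getArticleFromFeeds',
    'feed:getArticleFromFeeds',
    'file:/var/log/Xorg.0.log',
);

$chosen = parse_source(pick_random($sources));
echo $chosen['type'] . ' -> ' . $chosen['loc'] . "\n";
```

The empty-list check matters because a feed that returns no articles would otherwise produce an out-of-range index.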