Anyone got a website?
Anyone got a website?
And I mean: a real website which is generating standard Unix logfiles.
I just wrote a tool that takes any number of logfiles, reads them all, sorts every line by date/time, and then writes the lines back out, one file per month. I can finally have a look at the logfiles that AAW has created since 2001.
Anyone want to have such a thing?
What do you call a dinosaur with an extended vocabulary? A thesaurus.
- Zeratul2k
- Captain Catnip
- Posts: 2261
- Joined: July 11th, 2007, 6:10 am
- Location: Holding low orbit over All Anime World
- Contact:
Re: Anyone got a website?
Oh, neat! Throw it my way, please? I might find a use for it soon enough.
So, Lone Star, now you see that evil will always triumph... because good is DUMB!
Re: Anyone got a website?
My tool has a problem reading files larger than 1 GB.
What do you call a dinosaur with an extended vocabulary? A thesaurus.
Re: Anyone got a website?
I'm not 100% happy with my 'tools'. I wrote three tools to help me with archiving, sorting, and editing/cleaning my old logfiles. The first one, ConcatLog, concatenates a number of logfiles. It searches for files, sorts them by file name, then reads every file and appends it to one single output file. I use ConcatLog for small websites with very little traffic.
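For anyone who wants to play with the idea without the toolpack, here is a minimal Python sketch of the concatenation step described above. It is not the actual ConcatLog code; the function name `concat_logs` and the file name pattern are made up for illustration.

```python
from pathlib import Path


def concat_logs(directory, pattern, output_name):
    """Append every file matching `pattern`, in file-name order, to one output file.

    Hypothetical helper, not the real ConcatLog. Files are handled as raw
    bytes, so no character conversion takes place.
    """
    directory = Path(directory)
    output_path = directory / output_name
    # Sort by file name; skip the output file in case it matches the pattern too.
    sources = sorted(
        p for p in directory.glob(pattern) if p.is_file() and p != output_path
    )
    with open(output_path, "wb") as out:
        for source in sources:
            # Reading a whole file at once is fine for small logfiles only.
            data = source.read_bytes()
            out.write(data)
            # Make sure the next file starts on a fresh line.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return [p.name for p in sources]
```

Something like `concat_logs(".", "access*.log", "combined.log")` would then produce one combined file in the current directory.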
Another tool, SplitLog, does the opposite: it splits text files into parts for editing and searching. Try opening a 500 megabyte text file with Notepad. Get ready to press and hold the power button. By default it creates files of 100,000 lines each. I use SplitLog to fix illegal entries in logfiles (caused by crashes or hacking attempts).
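The splitting step can be sketched in a few lines of Python as well. Again, this is only an illustration and not the real SplitLog; the name `split_log` and the `.partNNNN` naming scheme are assumptions, and only the default of 100,000 lines per part comes from the description above.

```python
from pathlib import Path


def split_log(path, lines_per_part=100_000):
    """Split one big text file into numbered parts of `lines_per_part` lines.

    Hypothetical helper, not the real SplitLog. Part files are written next
    to the source as <name>.part0001, <name>.part0002, ... (made-up scheme).
    """
    path = Path(path)
    parts = []
    out = None
    with open(path, "rb") as source:
        # Iterating over the file reads one line at a time, so the whole
        # file never has to fit in memory.
        for index, line in enumerate(source):
            if index % lines_per_part == 0:
                if out is not None:
                    out.close()
                part_path = path.with_name(f"{path.name}.part{len(parts) + 1:04d}")
                parts.append(part_path)
                out = open(part_path, "wb")
            out.write(line)
    if out is not None:
        out.close()
    return parts
```

After editing the parts, they can be glued back together in file-name order.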
My masterpiece is SortLog, and I'm very proud of it. It collects all files in the current directory whose names start with 'access' and reads them line by line to find the earliest and latest dates in every logfile. Then it starts with the oldest month: it reads all files that contain entries for that month, loads the matching lines into memory while eliminating duplicates, sorts them by access time, and saves them to a single file. It repeats this for every month that has entries in any of the logfiles.
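Here is a rough Python sketch of the month-by-month procedure described above, for readers who want to see the idea in code. It is not the SortLog source. It assumes the usual access log timestamp such as `[10/Oct/2001:13:55:36 +0200]` with English month names, and the names `parse_time`, `sort_logs_by_month`, and the `sorted-YYYY-MM.log` output files are invented for this example.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed timestamp layout: [10/Oct/2001:13:55:36 +0200]
TIMESTAMP = re.compile(rb"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def parse_time(line):
    """Return the timestamp of a log line, or None if it has no valid one."""
    match = TIMESTAMP.search(line)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1).decode("ascii"), "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return None


def sort_logs_by_month(directory, prefix="access"):
    """Write one sorted, duplicate-free file per month (hypothetical helper)."""
    directory = Path(directory)
    sources = sorted(
        p for p in directory.iterdir() if p.is_file() and p.name.startswith(prefix)
    )
    # Pass 1: find out which months occur in any of the files.
    months = set()
    for source in sources:
        with open(source, "rb") as handle:
            for line in handle:
                stamp = parse_time(line)
                if stamp is not None:
                    months.add((stamp.year, stamp.month))
    # Pass 2: handle one month at a time so only that month is held in memory.
    written = []
    for year, month in sorted(months):
        entries = set()  # a set drops duplicate lines
        for source in sources:
            with open(source, "rb") as handle:
                for line in handle:
                    stamp = parse_time(line)
                    if stamp is not None and (stamp.year, stamp.month) == (year, month):
                        entries.add((stamp, line.rstrip(b"\r\n")))
        target = directory / f"sorted-{year:04d}-{month:02d}.log"
        with open(target, "wb") as out:
            for _, text in sorted(entries):
                out.write(text + b"\n")
        written.append(target)
    return written
```

Lines without a parsable timestamp are silently skipped in this sketch, and every input file is re-read once per month, which trades speed for lower memory use.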
So, why am I not happy? First, the tool uses windows-1251 as its code page, which is one of Microsoft's so-called ANSI code pages. I'm not sure what happens with international domains. Second, memory usage: I want to sort the lines for every month, which means I have to keep all the lines of a single month in memory. If you have a major website, you might run out of memory or get very poor performance while Windows enlarges your paging file. I'm unsure what crashes your computer first: me, Windows, or your logfiles? Third, it does some kind of conversion to fit the windows-1251 code page. If you have special characters as arguments, it might "work" with them. A possible solution would be to treat all files as UTF-8 and work with Unicode internally. If I understand this Unicode thing correctly, that is going to double the amount of memory needed to run the software...
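One possible way around the memory problem, sketched in Python: if every input file is already in chronological order on its own (an assumption, though web servers normally write their logs that way), the files can be merged as streams with the standard library's `heapq.merge`, which holds only one pending line per file. This is a different technique from what SortLog does, and the names `keyed_lines` and `merge_sorted_logs` are made up. Working on raw bytes also means no code page conversion happens at all.

```python
import heapq
import re
from datetime import datetime

# Assumed timestamp layout: [10/Oct/2001:13:55:36 +0200]
TIMESTAMP = re.compile(rb"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def keyed_lines(path):
    """Yield (timestamp, line) pairs from one file, skipping unparsable lines."""
    with open(path, "rb") as handle:
        for line in handle:
            match = TIMESTAMP.search(line)
            if match is None:
                continue
            try:
                stamp = datetime.strptime(
                    match.group(1).decode("ascii"), "%d/%b/%Y:%H:%M:%S %z"
                )
            except ValueError:
                continue
            yield stamp, line.rstrip(b"\r\n")


def merge_sorted_logs(paths, output_path):
    """Merge files that are each already time-sorted into one sorted file.

    Hypothetical helper. Returns the number of lines written.
    """
    streams = [keyed_lines(p) for p in paths]
    current_stamp = None
    seen = set()  # lines already written for the current timestamp only
    count = 0
    with open(output_path, "wb") as out:
        # Merge on the timestamp alone; ties keep the order of `paths`.
        for stamp, text in heapq.merge(*streams, key=lambda entry: entry[0]):
            if stamp != current_stamp:
                current_stamp, seen = stamp, set()
            if text in seen:
                continue  # duplicate entry with the same timestamp
            seen.add(text)
            out.write(text + b"\n")
            count += 1
    return count
```

This sketch writes a single output file; splitting it by month would be one more step. If a file is not in chronological order, the result will not be fully sorted either.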
I feel really geekish today. I had pizza and am wearing dirty jeans. There goes whatever is left of my cuteness
- Attachments
-
- SortLog.zip
- Toolpack for large logfiles. Sorry for any German output. No warranty. If you find a problem you may keep it. Keep out of reach of children and/or people with less than 2 GB of RAM.
- (508.33 KiB) Downloaded 145 times
What do you call a dinosaur with an extended vocabulary? A thesaurus.
Re: Anyone got a website?
Uhm, I forgot to mention the main advantage of my SortLog tool: I can "mix" the logfiles of bbs.allanime.org and www.allanime.org and thus get the traffic calculated for both websites at once.
What do you call a dinosaur with an extended vocabulary? A thesaurus.
- Hiki
- Honorary Evil Kitty
- Posts: 2946
- Joined: July 10th, 2007, 12:05 pm
- Location: ☆Court of Miracles☆
- Contact:
Re: Anyone got a website?
I'm sure you're still very cute! Even with your dirty jeans
Don't feed me violins.
Re: Anyone got a website?
Hmmmm, dirty jeans and a pizza, sounds really cute to me. I haven't a clue about your tools, mine are mostly made of steel.
Dogs have owners, Cats have staff
Some mistakes are too much fun to only make once.
Re: Anyone got a website?
Perhaps hair and slide...
What do you call a dinosaur with an extended vocabulary? A thesaurus.
Re: Anyone got a website?
I don't like girls in dirty jeans, not cute at all; I'd get rid of these evil dirty clothes instantly. But that's probably not the topic here.
Re: Anyone got a website?
I only have two pairs of jeans and I just did my laundry, so there is a good chance I'm not wearing either this week.
Anyone tried my tool(s)?
What do you call a dinosaur with an extended vocabulary? A thesaurus.