Tuesday, January 24, 2012

IRC message Regex

So I'm in the process of making an IRC bot, and one of the problems I always seem to have is parsing the message. If you don't know what parsing means, the simplest explanation is taking the raw message apart, in this case breaking it down into some form of object that organizes its data.

Normally I would split the message, then check each of the split pieces and try to organize them. However, when searching recently I found an easier way. Keep in mind, this is client-side parsing. I found the original on Caleb Delnay, so original credit goes there, but I wanted to expand upon it and convert it to work with the subtle differences in regex syntax across other languages.

*** Someone else has since added further improvements. I found them through some referrals in the stats, thanks to a lovely accreditation to this post. Check out more info at mybuddymichael.com. I'll try to integrate the changes into my post once I can take some time to give it a good lookover and try it out a bit (hopefully it will help out with a new bot I'm currently working on in Haskell).

The original .NET compatible regex is
^(:(?<prefix>\S+) )?(?<command>\S+)( (?!:)(?<params>.+?))?( :(?<trail>.+))?$

Python (place in an r"" string so you won't need to escape backslashes), Perl, PHP, and AS3:
^(:(?P<prefix>\S+) )?(?P<command>\S+)( (?!:)(?P<params>.+?))?( :(?P<trail>.+))?$

Java (versions before 7 don't support named groups, so you want to look at groups 2, 3, 5, and 7):
^(:(\\S+) )?(\\S+)( (?!:)(.+?))?( :(.+))?$

Java (7 and up supports named groups; I have not tried this yet):
^(:(?<prefix>\\S+) )?(?<command>\\S+)( (?!:)(?<params>.+?))?( :(?<trail>.+))?$

JavaScript (no named grouping, use groups 2, 3, 5, and 7, does not need to be in a string):
/^(:(\S+) )?(\S+)( (?!:)(.+?))?( :(.+))?$/

The basic premise is the assumption that messages are formatted along the lines of :<prefix> <command> <params> :<trailing>, where everything except the command is optional. If you know a better way to do any of them, or know ways in languages I left out, let me know. As far as the regex methods and ways to work it out, that is up to you. I am just supplying the pattern, and it is up to you to use it correctly.

Since regex can be complicated, hopefully this saves everyone some time. Figuring out the needed methods to use it shouldn't be too hard.
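For anyone who wants a starting point, here is a rough sketch of how the JavaScript pattern above might be used. The function name parseIrcLine and the shape of the returned object are my own assumptions for illustration, as is splitting the params on spaces. The pattern itself only gives you groups 2, 3, 5, and 7.

```javascript
// Pattern copied from the JavaScript variant above (unnamed groups).
const IRC_LINE = /^(:(\S+) )?(\S+)( (?!:)(.+?))?( :(.+))?$/;

// parseIrcLine is a hypothetical helper name, not part of any library.
function parseIrcLine(line) {
  // Strip the line ending first, since "." does not match CR or LF.
  const match = IRC_LINE.exec(line.replace(/\r?\n$/, ""));
  if (match === null) {
    return null;
  }
  return {
    prefix: match[2],  // undefined when the message has no prefix
    command: match[3],
    // Splitting params on spaces is an assumption layered on top of the pattern.
    params: match[5] === undefined ? [] : match[5].split(" "),
    trail: match[7],   // undefined when there is no trailing part
  };
}

const msg = parseIrcLine(":nick!user@host PRIVMSG #channel :Hello there\r\n");
console.log(msg);
```

For that sample line, the prefix comes out as nick!user@host, the command as PRIVMSG, the params as a one-item list holding #channel, and the trail as Hello there.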



***Edit:
After a comment about some stuff in the RFC, I played around with trying to make the regex work with that specification and came up with a partially working version. Due to the complexity, my lack of knowledge, and the lack of benefit from this, I will only post the one edit I made and hopefully not bother with this again. While the regex is good for something quick and dirty, string methods seem to be more practical.

^(:(?P<prefix>\S+) )?(?P<command>\S+)( (?!:)(?P<params>\S{14} (:)?|.+ :?))?((?P<trail>.+))?$

The params section will end up with either a trailing space or a space and colon. That's the best I could do, and the last I'll do of this.
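As a rough idea of what the string-method route could look like, here is a minimal sketch. The name splitIrcLine and the returned object are hypothetical, and it only follows the same :<prefix> <command> <params> :<trailing> layout as the first pattern. It does not attempt the RFC's rule about the number of params.

```javascript
// splitIrcLine is a hypothetical name used only for this sketch.
function splitIrcLine(line) {
  let rest = line.replace(/\r?\n$/, "");
  let prefix;
  let trail;

  // An optional prefix starts with ":" and runs to the first space.
  if (rest.startsWith(":")) {
    const end = rest.indexOf(" ");
    if (end === -1) return null; // a prefix with no command after it
    prefix = rest.slice(1, end);
    rest = rest.slice(end + 1);
  }

  // The trailing part starts at the first " :" and may contain spaces.
  const trailStart = rest.indexOf(" :");
  if (trailStart !== -1) {
    trail = rest.slice(trailStart + 2);
    rest = rest.slice(0, trailStart);
  }

  // Whatever is left is the command followed by space-separated params.
  const [command, ...params] = rest.split(" ").filter((part) => part !== "");
  if (command === undefined) return null;
  return { prefix, command, params, trail };
}

console.log(splitIrcLine(":irc.example.net 001 bot :Welcome to the network"));
```

The upside over the regex is that each step is a plain check you can read and adjust on its own, which is most of why string methods feel more practical here.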

Wednesday, January 18, 2012

Javascript Ajax and keeping order

The title is about as descriptive as I could come up with for this, but allow me to elaborate. On a forum I code for (basically my coding there is fixing and editing shit that's either broken or unsatisfactory), they decided to install a shoutbox. The shoutbox program worked, with one major flaw: the Ajax requests for the shouts were not being tracked and were therefore jumping out of flow. That broke shit, caused the script to shit itself, and led to a bunch of problems, from my browser throwing a socket error to posts duplicating. Just pure stupidity really. Once I went hunting for the error, it was blatantly obvious, and something I never thought I'd run into.

The metaphor goes like this: you send out messengers at regular intervals but don't track them, when you only need one messenger who is tracked. All these anonymous messengers know what to get the update on, but that information is only as recent as when they were sent. So if one doesn't get back before the next one is sent, for whatever reason, another messenger goes out for the same information. When they return, you now have duplicate, possibly even triplicate, copies of the same update. Rather than handle that, the script dumps all the info in, duplicates and all, at which point everything is overwhelmed, it can't work out the right new request to make, and it just shits itself.

Here's the remedy, and it should be the first thing you learn with Ajax requests. First, assign the request object to a variable. Then check two things, the readyState and the status. The request has finished once the readyState is 4, and it succeeded if the status is 200. Once that is satisfied, then and only then do you send out a new request. This allows things that need to be done in order to be done in order.

if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
    // Process what to do when done.
}

You should perform this check even if you are waiting for the onreadystatechange event, since a change in state does not mean the request is done. Now, this particular script used a JavaScript framework called Prototype. When you create an Ajax object with that, the readyState and status are available in the object's transport object.

The bottom line: remember the flow, and take control of anything asynchronous. Do not rely on delays or speed to maintain the flow in those cases.
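To make the "one tracked messenger" idea concrete, here is a minimal sketch in plain JavaScript rather than Prototype. It is not the shoutbox's actual code: pollInOrder, its parameters, and the example URL in the comment are all hypothetical names of mine. The next request is only scheduled from inside the readyState 4 branch, so two requests are never in flight at once. Trying again after a failed response is my own assumption, since the check above only covers the success case.

```javascript
// pollInOrder and its parameters are hypothetical names for illustration.
// createRequest defaults to the browser's XMLHttpRequest; the parameter
// exists only so the flow can be exercised without a browser.
function pollInOrder(url, onUpdate, delayMs, createRequest = () => new XMLHttpRequest()) {
  let stopped = false;

  function sendNext() {
    if (stopped) return;
    const xmlhttp = createRequest();
    xmlhttp.onreadystatechange = function () {
      // Ignore intermediate states; 4 means the request has finished.
      if (xmlhttp.readyState !== 4) return;
      if (xmlhttp.status === 200) {
        onUpdate(xmlhttp.responseText);
      }
      // Only now, with the previous request finished, schedule the next one.
      // Retrying after a non-200 response is an assumption of this sketch.
      setTimeout(sendNext, delayMs);
    };
    xmlhttp.open("GET", url, true);
    xmlhttp.send();
  }

  sendNext();
  // The returned function stops the loop before the next request goes out.
  return function stop() {
    stopped = true;
  };
}

// Possible use in a page (URL and handler are made up):
// const stop = pollInOrder("/shoutbox/latest", (text) => console.log(text), 3000);
```

The difference from a bare setInterval is that the delay is counted from when the last messenger got back, not from when it was sent, so a slow response just pushes the next request later instead of stacking a second one on top of it.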
