Tuesday, January 24, 2012

IRC message Regex

So I'm in the process of making an IRC bot, and one of the problems I always seem to have is parsing the message. If you don't know what parsing means, the simplest explanation is breaking the raw message down into some form of object that organizes its data.

Normally I would split the message, check the various pieces, and try to organize them. However, when searching recently I found an easier way. Now keep in mind, this is client-side parsing. I found the original on Caleb Delnay, so original credit goes there, but I wanted to expand upon it and convert it to work with the subtle differences in regex across other languages.

*** Further improvements were added by someone else, whom I found through some referrals in the stats, thanks to a lovely accreditation to this post. Check out more info at mybuddymichael.com. I'll try to integrate the changes into my post once I can take some time to give them a good lookover and try them out a bit (hopefully they will help out with a new bot I'm currently working on in Haskell).

The original .NET-compatible regex is:
^(:(?<prefix>\S+) )?(?<command>\S+)( (?!:)(?<params>.+?))?( :(?<trail>.+))?$

Python (place in an r"" string so you won't need to escape backslashes), Perl, PHP, and AS3:
^(:(?P<prefix>\S+) )?(?P<command>\S+)( (?!:)(?P<params>.+?))?( :(?P<trail>.+))?$

Java (versions before 7 don't support named groups, so look at groups 2, 3, 5, and 7):
^(:(\\S+) )?(\\S+)( (?!:)(.+?))?( :(.+))?$

Java (7 and up supports named groups; I have not tried this yet):
^(:(?<prefix>\\S+) )?(?<command>\\S+)( (?!:)(?<params>.+?))?( :(?<trail>.+))?$

JavaScript (no named grouping, use groups 2, 3, 5, and 7, does not need to be in a string):
/^(:(\S+) )?(\S+)( (?!:)(.+?))?( :(.+))?$/

The basic premise is the assumption that messages are formatted along the lines of :<prefix> <command> <params> :<trailing>, where any of the values are optional. If you know a better way to do any of them, or know ways in languages I left out, let me know. As far as the regex methods and ways to work it out, that is up to you. I am just supplying the pattern, and it is up to you to use it correctly.

Since regex can be complicated, hopefully this saves everyone some time. Figuring out the needed methods to use it shouldn't be too hard.
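
As a rough illustration, here is a minimal sketch of using the Python version of the pattern. The names IRC_RE and parse_irc, the returned dictionary layout, and the sample messages are my own assumptions for the example, not part of the original pattern or of any IRC library.

```python
import re

# The Python pattern from above, compiled once and reused.
IRC_RE = re.compile(
    r"^(:(?P<prefix>\S+) )?(?P<command>\S+)( (?!:)(?P<params>.+?))?( :(?P<trail>.+))?$"
)

def parse_irc(line):
    """Hypothetical helper: break one raw IRC line into a small dict."""
    match = IRC_RE.match(line.rstrip("\r\n"))  # drop the line ending first
    if match is None:
        return None
    params = match.group("params")
    return {
        "prefix": match.group("prefix"),            # None when the message has no prefix
        "command": match.group("command"),
        "params": params.split() if params else [],
        "trail": match.group("trail"),              # None when there is no trailing part
    }

print(parse_irc(":nick!user@host PRIVMSG #chan :hello world"))
print(parse_irc("PING :irc.example.net"))
```

Groups that did not take part in the match come back as None, so it is worth checking for that before using a value.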

After a comment about some stuff in the RFC, I played around with trying to make the regex work with that specification and came up with a partially working version. Due to the complexity, my lack of knowledge, and the lack of benefit from this, I will only post the one edit I made and hopefully not bother with this again. While regex is good for something quick and dirty, string methods seem to be more practical.

^(:(?P<prefix>\S+) )?(?P<command>\S+)( (?!:)(?P<params>\S{14} (:)?|.+ :?))?((?P<trail>.+))?$

The params section will end up with either a trailing space or a space and colon. That's the best I could do, and the last I'll do of this.
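
For comparison, here is a minimal sketch of what the string-method approach could look like in Python. The function name parse_irc_split and its return shape are assumptions for illustration. It puts the trailing part in as the last parameter and does not enforce any limit on the number of parameters.

```python
def parse_irc_split(line):
    """Hypothetical helper: split one raw IRC line into (prefix, command, params)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # The prefix runs from after the leading colon to the first space.
        prefix, _, line = line[1:].partition(" ")
    # Everything after the first " :" is the trailing part and may contain spaces.
    head, sep, trail = line.partition(" :")
    parts = head.split()
    command = parts[0] if parts else None
    params = parts[1:]
    if sep:
        params.append(trail)  # the trailing part becomes the last parameter
    return prefix, command, params

print(parse_irc_split(":nick!user@host PRIVMSG #chan :hello world"))
print(parse_irc_split("JOIN #chan"))
```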

Wednesday, January 18, 2012

Javascript Ajax and keeping order

The title is about as descriptive as I could come up with for this, but allow me to elaborate. On a forum I code for (basically my coding is fixing and editing shit that's either broken or unsatisfactory), they decided to install a shoutbox. The shoutbox program worked, with one major flaw. The Ajax requests for the shouts were not being tracked and were therefore jumping out of flow, which broke shit, causing the script to shit itself and leading to a bunch of problems: my browser throwing an error with the socket, posts duplicating, just pure stupidity really. Upon hunting for the error, it was blatantly obvious and something I never thought I'd run into.

The metaphor of it goes like this... you send out messengers at regular intervals but don't track them, when you only need one messenger who is tracked. So all these anonymous messengers know what to get the update on, but that information is only as recent as when they were sent. Therefore, if one doesn't get back before the next one is sent, for whatever reason, another messenger is sent for the same information. When they return, you now have duplicates, possibly even triplicates, of the same update info. Rather than handling that, the script dumps all the info in, duplicates and all, at which point everything is overwhelmed, it can't work out the right new request to make, and it just shits itself.

Here's the remedy, and it should be the first thing you learn with Ajax requests. First, assign the object to a variable name. Then you check for two things, the readyState and the status. It's not done until the readyState is 4 and the status is 200. Once that is satisfied, then and only then do you send out a new request. This allows things that need to be done in order to be done in order.

if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
    // Process what to do when done.
}

You should perform this check even if you are waiting for the onreadystatechange event, because just because the state has changed does not mean that the request is done. Now for this particular script, it used a JavaScript framework called Prototype. When you create an Ajax object with that, the readyState and status are available in the object's transport object.

The bottom line is to remember the flow and to control anything asynchronous. Do not rely on delay or speed to maintain the flow in those cases.
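
To show the messenger problem in miniature, here is a small sketch written in Python with asyncio rather than the shoutbox's JavaScript. It is not the shoutbox's code. fake_fetch is a hypothetical stand-in for the Ajax request, and the IDs are made up. One loop waits for each reply before asking again, and the other sends everything out untracked.

```python
import asyncio

async def fake_fetch(last_id):
    """Hypothetical request: returns the one update that is newer than last_id."""
    await asyncio.sleep(0.01)  # pretend network delay
    return [last_id + 1]

async def poll_in_order(rounds):
    # One tracked messenger: wait for the reply before sending the next request.
    seen, last_id = [], 0
    for _ in range(rounds):
        updates = await fake_fetch(last_id)
        seen.extend(updates)
        last_id = seen[-1]  # the next request asks only for newer updates
    return seen

async def poll_untracked(rounds):
    # Anonymous messengers: all are sent with the same stale last_id.
    tasks = [asyncio.create_task(fake_fetch(0)) for _ in range(rounds)]
    seen = []
    for task in tasks:
        seen.extend(await task)
    return seen

print(asyncio.run(poll_in_order(3)))   # [1, 2, 3]
print(asyncio.run(poll_untracked(3)))  # [1, 1, 1]
```

The untracked version comes back with the same update three times, which is the duplication described above.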

Saturday, January 14, 2012

Python Slice and Stride notation

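I have been looking a bit more into Python's slice notation, more specifically, the stride notation. If you don't know what this is, the basic premise is that sequence[start:stop:step] picks items from start up to, but not including, stop, moving step items at a time, and each of the three parts can be left out.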


This also works with strings, and for the sake of example, I will use strings. The basics work like this:
>>> "01234"[:]
'01234'
>>> "01234"[1:]
'1234'
>>> "01234"[:4]
'0123'
>>> "01234"[:-1]
'0123'
>>> "01234"[-1:]
'4'

Now those are basic slice notations with no stride. The stride, or step, determines which items are picked and in what direction, and using a negative stride reverses the string or list, so the [::-1] example below is a quick and easy way to reverse a string.
>>> "01234"[:-1]
'0123'
>>> "01234"[-1:]
'4'
>>> "01234"[::-1]
'43210'
>>> "01234"[::2]
'024'
>>> "01234"[::-2]
'420'
>>> "01234"[::0]
Traceback (most recent call last):
  File "<pyshell#55>", line 1, in <module>
ValueError: slice step cannot be zero
Now so far things work as expected, so you can be pretty happy working with this. This is where things seem to get complicated: using a slice and stride together. When using a positive stride, things work fine, like this.
>>> "01234"[1::2]
'13'
>>> "01234"[1:-1:2]
'13'
>>> "01234"[1:-2:2] # The slice alone, [1:-2], would result in '12'
'1'
Now... let's take a peek at negatives. Right now it seems simple, like it slices and then follows the stride, but when using a negative stride, the slicing acts a bit differently.
>>> "01234"[1:-2:-1]
''
Just a quick analysis: the slice alone yields 12, so you would expect it to be reversed into 21. However, this is not the case, and you get an empty string instead. To get a reversed piece of the string, the larger index has to come first, like this.
>>> "01234"[-2:1:-1]
'32'
Since I do not know the specifics, I can't say why it is like this, with the indexes in the slice seemingly needing to be reversed, but we're not done yet. It gets very strange.
>>> "01234"[:1:-1]
'432'
>>> "01234"[:-5:-1]
'4321'
>>> "01234"[:-4:-1]
'432'
Okay, now things are just plain backwards. Rather than keeping what comes before the stop index like a normal slice does, it instead returns what comes after it, reversed. Now let's make things even more crazy.
>>> "01234"[1:-4:-1]
''
>>> "01234"[-1:-4:-1]
'432'
>>> "01234"[-1::-1]
'43210'
>>> "01234"[-2::-1]
'3210'
>>> "01234"[0::-1]
'0'
>>> "01234"[1::-1]
'10'
So when using a negative stride, it seems that you need to work backwards on the indexes, seeing as the stride changes how the indexes are set up. Specifics as to this behavior I don't know, but if you do, let me know. The question then might arise: how can one avoid this strangeness of negative strides and indexes? Well, the easy way to get around it is to chain together a slice and a stride, or vice versa, so that they are applied in the order you would expect.
>>> "01234"[:-4][::-1] # Same as "01234"[0::-1]
'0'
>>> "01234"[1:][::-1] # Same as "01234"[:0:-1]
'4321'
>>> "01234"[::-1][1:] # Same as "01234"[-2::-1]
'3210'
So that's a little look at slice notation and how things can get kind of weird with the indexes. I don't think there would be much use for the full expanded notation including a slice and a stride, however you never know what someone might be trying to do. If nothing else, [::-1] can be used as a shorter way to get a reversed copy of a list instead of using the reverse() method, and as a nice and easy way to reverse a string. Note that you can chain as many of them together as you wish, even off of a result returned from a function.
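
For anyone curious about what is going on with the indexes, here is a small sketch using the built-in slice object and its indices() method, which reports the concrete start, stop and step Python ends up using for a given length. The helper name resolve is made up for the example. With a negative step, Python starts at start and walks left until just before stop, which is why a start that sits to the left of the stop visits nothing.

```python
def resolve(text, start, stop, step):
    """Hypothetical helper: show how text[start:stop:step] is resolved."""
    s = slice(start, stop, step)
    concrete = s.indices(len(text))  # (start, stop, step) with negatives and defaults filled in
    visited = list(range(*concrete)) # the indexes that actually get visited
    return concrete, visited, text[s]

print(resolve("01234", 1, -2, -1))    # ((1, 3, -1), [], '')
print(resolve("01234", -2, 1, -1))    # ((3, 1, -1), [3, 2], '32')
print(resolve("01234", None, 1, -1))  # ((4, 1, -1), [4, 3, 2], '432')
```

In the first call, -2 is turned into index 3, so the walk would have to go from 1 down to 3, which is why the result is empty.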

Thursday, January 12, 2012

Backtrack 4 Sounds

If you have ever dabbled with some hacking tools, I'll bet you have at least heard of Backtrack. If not, it's a Linux distro full of penetration testing tools. It's also small and easy to boot, making it great to use for data recovery with a live CD. However, one of my favorite parts about it, mainly in Backtrack 4 final, was the login sound. Seeing as I use Ubuntu and can easily change the sound, I embarked on my hunt to find the sound file to replace Ubuntu's with. Of course you can also use it on Windows if you know how to change it, but that seemed easier to find than the location of this damn file.

The problem was it's not in the expected folder (/usr/share/sounds). So I hopped on Backtrack and had it search for the file; the one I wanted was KDE_Startup_1.ogg. I found Backtrack's sound files in /opt/kde3/share/sounds, so if you want to dig out any of Backtrack's sound files, that's where they are. I have yet to check the newer version. Either way, now you know where to pull the original sound file from so you can convert it. If it's not there, check the Control Center under system notifications and search for the startup sound; it shouldn't be too hard to find. I probably would have found it quicker if I hadn't been lazy and tried to find it through Ubuntu instead of checking directly on Backtrack, when I didn't even know the file name.

Anyway, hopefully someone finds it helpful.
