Sunday, January 19, 2014

Bulk File Renaming

While transferring music and shows to a media server, I noticed an annoying problem: none of the file names were formatted the way I wanted. After manually renaming the files in one folder and realizing how long it took, I thought there must be an easier way. After a very short Google search, I figured there was no better way to get exactly what I needed than to write the program myself. So I created rename.py, a command-line utility for bulk renaming of files and/or directories. It supports basic stripping, regular expression matching and Python string formatting.

You can find the download link for it here.

Now here is a quick rundown on ways to use it.

# Strip the first 4 characters off every file name
./rename.py 4
# Do the same in the Music/5FDP folder
./rename.py -p Music/5FDP 4
# Take any file in Music/RATM that looks like "something - somethingelse"
# and rename it to RATM.somethingelse
./rename.py -p Music/RATM -r ".+ - (.*)" "RATM.{0}"
# Show help for more info
./rename.py -h

There are other options, such as -d to include directories and -o to rename only directories. Named groups are also supported. To get the most out of this tool, it is worth reading up on regular expressions and Python's str.format().
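To make the combination of regular expressions and str.format() a bit more concrete, here is a minimal sketch of how a new name could be computed from a pattern and a template. This is not the actual code from rename.py: the function name new_name and the use of re.fullmatch are assumptions made purely for illustration.

```python
import re

def new_name(pattern, template, filename):
    """Return the formatted name, or None if the pattern does not match.

    Hypothetical helper; rename.py itself may match and format differently.
    """
    match = re.fullmatch(pattern, filename)
    if match is None:
        return None
    # Positional groups fill {0}, {1}, ...; named groups fill {name}.
    return template.format(*match.groups(), **match.groupdict())

# Positional group, as in the RATM example above
print(new_name(r".+ - (.*)", "RATM.{0}", "something - somethingelse.mp3"))

# Named groups
print(new_name(r"(?P<num>\d+) - (?P<title>.*)", "{num}.{title}", "01 - somethingelse.mp3"))
```

Passing both the positional and the named groups to format() lets one template style cover both kinds of pattern.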

Thursday, January 16, 2014

Laziness and Non-Strict Programming

I recently found myself buried in PHP code while completely redoing my site. On top of that, I was also patching another site that had switched to some new software. On that site, the error logs were massive, even though everything seemed all right from the front end. A lot of the errors came from pure laziness causing inconsistencies, and I resorted to very messy quick fixes because I had no desire to read and edit everything.

A non-strict language allows the use of variables that have not been declared, either giving a warning or silently ignoring the problem and continuing on. The biggest trouble this causes is in conditional testing. In a language like C, every variable must be declared: any reference to it, even on a path that never runs, requires a declaration. Other languages may issue a warning, and some ignore it altogether if the variable is never actually used.

In most interpreted languages, an if statement takes on a bit of lazy behavior: if part of a condition isn't needed to determine the result, it isn't evaluated. This is known as short-circuit evaluation. For example, in $a = true; if ($a || $b);, the variable $b is never needed, because the end result is true regardless of its value. Nothing happens with $b unless $a is false. So unless every possible condition is tested, an error or inconsistency could be lurking in the shadows. In a non-strict setting, an undeclared variable that does get evaluated simply results in false, 0 or an empty string, depending on how it is being used. With strict behavior, this is not the case.
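The same lurking effect can be sketched in Python, with the caveat that Python raises a NameError rather than quietly substituting a default value. The names check and undeclared_flag are made up for illustration.

```python
def check(a):
    # `undeclared_flag` is never defined anywhere. Python only looks the
    # name up if the right-hand side of `or` actually has to be evaluated.
    if a or undeclared_flag:
        return "ran"
    return "skipped"

# `a` is truthy, so the right-hand side is never touched and nothing fails.
print(check(True))

# With a falsy `a`, the right-hand side must be evaluated and the bug surfaces.
try:
    check(False)
except NameError as err:
    print("hidden bug surfaced:", err)
```

A test suite that only ever calls check(True) would never notice the problem.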

Languages like Perl and JavaScript (ECMAScript 5 and later) have an option to use strict semantics. The end result is that the code either adheres to strict standards or fails altogether, as opposed to silently ignoring the problem and continuing. PHP can also enable strict standards, but the bottom line is that all it will do is corner the programmer.

Strict mode or not, a programmer should always try to declare all variables correctly and not rely on fallbacks to pick up any oversights. A programmer should also understand that just because a program appears to work does not mean it is working correctly. One thing I find to be a double-edged sword in PHP is the isset() function. While it makes for easy patching, down the road it can lead to bulkier code that is always checking whether a variable is set, or to code that is difficult to decipher when trying to find where a variable is actually declared, and why.

Now, before someone jumps on the bandwagon of something like being "Pythonic": you can easily emulate an isset function in Python, or set up a try/except structure to silently ignore such things.

def isset(a):
    return a in globals()

This is just a rough version. Pass the variable name as a string and it will check for a global variable declared with that name. It can be altered to check further scopes; just avoid getting stuck in a looping reference. The same can be done in other interpreted languages where such a structure is visible.

There are always ways to be lazy or make programming seem non-strict, but the bottom line is that the large-scale effort and the complications are not worth the possible benefits, whatever people may see in it.

The main use I can see people finding for it is in handling user input. In functional programming languages, user input and I/O tasks are considered unreliable and side-effecting, and I agree. For this reason, I feel the proper way to deal with them is to assume Murphy's Law: anything that can go wrong will.
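One way to picture that mindset is a small parser that expects the input to be wrong. This is only a sketch under assumed requirements: the function name parse_count, its default value and its bounds are all hypothetical.

```python
def parse_count(raw, default=0, lowest=0, highest=100):
    """Turn untrusted input into an int within [lowest, highest]."""
    try:
        # str() copes with None and other non-string values;
        # strip() drops stray whitespace before conversion.
        value = int(str(raw).strip())
    except ValueError:
        # Anything that is not a plain integer falls back to the default.
        return default
    # Clamp out-of-range values instead of trusting them.
    return max(lowest, min(highest, value))

for sample in ["42", "  7 ", "abc", None, "9999", "-5"]:
    print(repr(sample), "->", parse_count(sample))
```

The fallback is explicit and lives in one place, rather than being scattered through the program as isset-style checks.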
