Thursday, August 25, 2016

Day of Command (יום פקודה)

Command and Conquer


echo "olleh#dlrow~" |rev|cut -d~ -f2|sed "s/#/ /"|awk '{print $2 " " $1}'
Can you read this? Great!

(By the way: some code ahead, so no Hebrew this time)
When I got out of university, I had exactly zero familiarity with the Linux shell, and found myself in an environment that relied heavily on this exact user interface. I learned pretty quickly that "less" is a way to read a file, and that "ls" is the equivalent of "dir", but that was mostly it. It took me several months to start seeing the vast world of command-line utilities that come built in with Linux (or are easily available), and since then I get a reminder every now and then of how powerful those utilities are. Lately I got a quick series of such reminders, so I figured I should pass the message around and remind everyone else of this fact as well.
What I really like about those utilities is that it almost does not matter which one you choose; each deserves a separate post all by itself (and, while we are at it, here's a great post on sed). But despite having some complexity to dig into, most utilities are simple to use and have a single goal that can be described in a short sentence (with the exception of "awk", which seems to do way too much). The combination of simple and powerful makes them a great tool to use and a huge time saver.

So, inspired by a short session given by Natalie Bennet during TestRetreat, here's a short list of my favorite command-line tools, in no particular order.

  • XmlStarlet - this tool is my most recent reminder of why I really like those seemingly small utilities. It does not come out of the box with every Linux distribution, but it's a powerful tool for editing XML files. If you have ever tried to edit an XML file in a programming language (such as Java), you know it can be a bit cumbersome. I had a list of identical elements whose IDs I wanted to change so that each would be unique. I wrote something very similar to the following few lines, and lo and behold - the file was updated just the way I wanted it (note: sometimes the command is called "xml", depending on the way you installed it):
    # give each of the 30 testElement nodes a unique id, editing the file in place (-L)
    for i in {1..30}
    do
       xmlstarlet ed -L -u "/root/testElement[$i]/@id" -v "text$i" testFile.xml
    done
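    To verify the result, xmlstarlet can also query the file back. A quick sketch, assuming the same hypothetical file structure as in the loop above:
    # print the id of every testElement, one per line
    xmlstarlet sel -t -m "/root/testElement" -v "@id" -n testFile.xml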
  • xmllint - While I am speaking of XML tools, xmllint is what I use to search within an XML file, or to validate that the file structure was not corrupted.
    # if there is no output from this line - the XML is valid.
    xmllint --noout myFile.xml
    # opens a shell where I can ask to print a specific element described by an XPath expression
    xmllint --shell myFile.xml
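    Inside that shell, "cat" followed by an XPath expression prints the matching node. A sketch of a session, reusing the hypothetical file from the XmlStarlet example ("/ >" is the shell's prompt):
    / > cat /root/testElement[1]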
  • grep - Yet another powerful tool. In principle, it does a really simple thing - it filters text to help you find stuff. By default you get the whole line that matches the search condition (which is a regular expression), although you can set it to output lines that do not match the condition, or to output only the match itself.
    This looks like this:

    echo "hello world \n and to you too"|grep hello  --> hello world
    echo "hello world \n and to you too"|grep -o hello  --> hello
    echo "hello world \n and to you too"|grep --color hello  --> *hello* world (the asterixes are to mark "bold")
    echo "hello world \n and to you too"|grep -v hello  --> and to you too
  • find - Simple, yet powerful. With the correct flags you can find a file, a directory, or a link with a specific name, or one that was modified before or after a certain point in time, as the examples below show.
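    A couple of illustrative invocations (the paths and patterns here are made up):
    # .log files under /var/log that were modified in the last 24 hours
    find /var/log -name "*.log" -mtime -1
    # directories named "logs" anywhere under the current directory
    find . -type d -name logs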
  • xargs - this utility's only job is to glue other commands together: it takes the output of the previous command and turns it into arguments for the next one. So, for example:
    find . -name "*.java" |xargs grep --color "searchString"
    will find all of the places where "searchString" appears inside any java file. Running the same command without xargs behaves quite differently, as the sketch below shows.
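    Without xargs, grep filters find's own output - the file names - so this variant only finds java files whose name contains the string:
    find . -name "*.java" | grep --color "searchString"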
  • sed - a simple way to do search and replace. Do read the post linked above for more details; a minimal example follows.
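    A minimal sketch (the pattern and file name are made up):
    # replace every occurrence of "foo" with "bar", editing the file in place
    sed -i "s/foo/bar/g" myFile.txt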
  • du - Gives information about the disk space taken up by files ("du" stands for disk usage). Most of the time, the only proper way to use it is
    du -h --max-depth=1
    This will give you the results in human-readable form, and will not dive any deeper than one folder down, so you won't end up with a million lines, each indicating some 200Kb file. Very useful when cleaning up space on your hard drive.
This post is getting a bit long, and it isn't really "all you need to know about the Linux command line", so I'll skip other utilities I use frequently (such as cp, mv, less, vi, wget and some others). Instead, I want to share the story of how I learned - the hard way - the power of the command-line utilities.
I was quite new at work - less than a year after I finished university - and we were doing a security ramp-up for our application. Since we deal with payment cards, we are bound by PCI-DSS, so we have a handy security requirements document to refer to when we're out of ideas ourselves. This time we chose to focus on the requirement never to print card numbers to the logs, not even in debug logs. So I set to the task - create a check that will help us spot card numbers in the logs. Sure, why not. We were (and still are) writing our automated checks in Java, and I set out to add another scenario to those tests: I performed some activities with several cards, then looked for a way to scan the files for card numbers. The check ran, and even found a place or two where we had card numbers printed.

The test was a couple of hundred lines long, and it exercised only a very small sample of the actual activities in our system. I wasn't very happy with the result. When I asked for advice from a more experienced tester in my team, he asked me "why don't you write a script?" So I did. I wrote all of the card numbers we use in our system to a file, and wrote the following script (I omitted some variable definitions here, since they are not that interesting) -
# search for PANs in the files under the logs directory
for fullFileName in $(find ${LOGS_DIR} -type f -follow); do
        # collect every distinct card number that appears in this file
        for card in $(egrep -a -o "${searchPattern}" ${fullFileName} | sort | uniq); do
                echo ${fullFileName} ${card} >>${resFile}
                # strip the directory part, keeping only the file name itself
                filename=`echo ${fullFileName}|rev|cut -d\/ -f1|rev`
                # keep a copy of the offending file, with the card number as a suffix
                cp ${fullFileName} ${resDir}${filename}_${card}
                wasFound="Y"
        done
done

### SEND E-mail#####

if [ "${wasFound}" = "Y" ];
then
        (echo -e "PAN was found in logs in machine: ${HOSTNAME}.\n copies of the logs are kept in ${resDir} (copies are kept as <fileName_date>_<PAN>). \n\n PANs found are:\n";cat ${resFile})|sendmail myemail@workplace.com;
        echo "PAN found"
fi

Basically, what I'm doing is looping over the files and searching each one for card numbers (PAN stands for "primary account number"). I then copy each of the offending files, adding the card number as a suffix to the file name, so that I'll know what to look for when investigating.

Do you see how short this piece of code is? And it covers a whole lot more ground than the automated check I wrote in Java, which was several hundred lines long. Deleting the Java code was a strong lesson (and, to my surprise, I really enjoyed deleting a week's worth of effort).

So, if you are not familiar with the command line - invest some time in getting to know it a bit. It will probably open a whole new world of options for you. 


P.S. 
If you are using Windows, you probably don't have access to the Linux commands. From what I hear, PowerShell is just as powerful.





