Pretty Print JSON From The Command Line

JSON can come in all shapes and sizes, and sometimes you want to see it in a structured format that’s easier on the eyes. This is called pretty printing. But how do you accomplish that, especially if you have a really large JSON file? While there are some converter tools online to show json files in a pretty format, they can get very slow and even freeze if your file is too large. Let’s do it locally.

Requirements: Have Python installed, and a terminal environment.

cat file.json | python -m json.tool > prettyfile.json

And that’s all there is to it! Python comes built in with a JSON encoding/decoding library, and you can use it to your advantage to get nice formatted output. Alternatively, if you are receiving JSON from an API or HTTP request, you can pipe your results from a curl call directly into this tool as well.

Hope this helps!

A Quick Introduction to HDFS

The Hadoop filesystem is called HDFS, and today I’m going to give a short introduction to how it works for a beginner.

The Hadoop File System (HDFS) sits on top of a Hadoop cluster and facilitates the distributed storage and access of files.  When a file is stored in HDFS, it is split into chunks called “blocks”. They can be of different sizes. The blocks are scattered between the nodes. These nodes have a daemon running called a datanode. There is one node called the namenode that has metadata about the blocks and their whereabouts.

To protect against network or disk failure data is replicated in three places across the cluster. This makes the data redundant. Therefore if one datanode goes down, there are other copies of the data elsewhere. When this happens a new copy of the data is created, so that there are always three.

The namenode is even more important, because it has metadata about all the files. If there is a network issue, all of the data will be unavailable. However, if the disk on the namenode fails, the data may be lost forever, because the namenode has all the information about how the pieces of the files go together. We’d still have all the chunks on the data nodes, but we’d have no idea what file they go to.

To get around this issue, one solution is to also mount the drive on a network file system (NFS). Another way to approach this (which is a better alternative) is to have an active namenode and a standby namenode. This way, there is a “backup” if something goes wrong.

Some commands:
  • To list files on HDFS:
    • $ hadoop fs -ls
  • To put files on HDFS:
    • $ hadoop fs -put filename
    • this takes a local file and puts on HDFS
  • To display the end of a file :
    • $ hadoop fs -tail filename
  • Most bash commands will work if you put a dash in front of them
    • $ hadoop fs -cat
    • $ hadoop fs -mv
    • $ hadoop fs -mkdir
    • etc…

Halloween Themed Math Puzzles

free-halloween-powerpoint-background-8

Happy Halloween From  The Muse Garden!

In the tradition of my Holiday Math Puzzles, I’m here with an appropriately themed puzzle for this time of year.

Candy Distribution

Halloween-Candy1

It’s that time of year all right. You’re out and about, trick or treating with your friends or family and when you come home, you decide to dump all the candy out on the floor to sort through it. But then, as siblings often do, you begin to bicker about who has “more” than the other. In fact, there are some candies that you really like, and some you don’t like. You’d rather have a bunch of chocolate than a bunch of peppermints, for instance.  But wait a minute! Of course you can’t like the same thing! Your sibling actually likes peppermints!

Here’s a table of the different candies you have. Each candy has a “value” to it; that is, how much you “want” it. Try to split up the candies such that you and your sibling both have as equal value as possible at the end. And no fighting!

Candy Quantity Your Value (per piece) Sibling’s Value (per piece)
Candy Corn 150 25 50
Peppermints 50 5 50
Peanut Butter Cups 10 100 75
Hershey Bars 25 50 10
Kit Kat 20 75 30

4 Is The Magic Number – Riddle

So fishes56 alerted me to this riddle in a comment here, and if you didn’t see it I wanted to share it here in a separate post cause its a little fun thing to think about.

12 is 6, 6 is 3, 3 is 5, 5 is 4, 4 is 4.

4 is the magic number, why?

68 is 10, 10 is 3, 3 is 5, 5 is 4, 4 is 4.

4 is the magic number, why?

26 is 9, 9 is 4, 4 is 4. 4 is the magic number, why?

Give it a try. 🙂

The Collatz Conjecture and Hailstone Sequences: Deceptively Simple, Devilishly Difficult

Here is a very simple math problem:

If a number is even, divide the number by two. If a number is odd, multiply the number by three, and then add one. Do this over and over with your results. Will you always get down to the value 1 no matter what number you choose?

Go ahead and test this out for yourself. Plug a few numbers in. Try it out. I’ll wait.

Back? Good. Here’s my example: I start with the number 12.

12 is even, so I divide it by 2.

12/2=6.

6/2=3.

3 is odd, so we multiply it by 3 then add 1.

(3*3)+1=10.

10/2=5.

(5*3)+1=16.

16/2=8.

8/2=4.

4/2=2.

2/2=1.

And we have arrived at one, just like we thought we would. But is this ALWAYS the case, for ANY number I could possibly dream of?

This may sound easy, but in fact it is an unsolved mathematics problem, with some of the best minds in the field doing research on it. It’s easy enough to check low values, but numbers are infinite, and can get very, very, very large. How do we KNOW that it works for every single number that exists?

That is what is so difficult about this problem. While every single number we have checked (which is a LOT) ends up at 1 sooner or later (some numbers can bounce around for a very long time, hence why they are called “hailstone sequences”), we still have no method to prove that it works for every number. Mathematicians have been looking for some method to predict this, but despite getting into some pretty heavy mathematics in an attempt to attack this problem, we still do not know for sure.

This problem is known as the Collatz Conjecture and is very interesting to mathematicians young and old because it is so easy to explain and play with, yet so tough to exhaustively prove. What do you think? Will this problem ever be solved? And what would be the implications if it was?

Here is some simple Python code that can display the cycles for any number you type in:

#!/usr/bin/python
num = int(input("Enter a number: "))
while num!=1:
if num%2 == 0:
num = num/2
else:
num = (num*3)+1
    print num