Friday, 3 July 2009

Using Perl to create a graph and analyse it

Perl is a brilliant language! It is simple and very straightforward, and you can get started in no time!

After only two days of rushing through some of the basics of Perl, I managed to find and use some modules from CPAN. I started by parsing an XML file, then used the data to build a graph, and I am now ready to do some more advanced graph calculation and analysis!

Today, I experimented with parsing a small portion of an XML file (5 papers with 10 authors) and successfully created a graph out of it. With the methods already provided by the Graph module, I can easily query the average path length of the graph and so on. It turned out that my small graph has an average path length of 1.54!
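
To make the idea concrete, here is a minimal sketch of what "build a graph from papers and query its average path length" involves. It is written in plain Python using only the standard library, not the Perl Graph module I actually used. The function names and the toy author lists are made up for illustration, and I am assuming that two authors are linked when they share a paper and that the average is taken over reachable pairs only. The Graph module may define the measure slightly differently.

```python
from collections import deque
from itertools import combinations


def build_coauthor_graph(papers):
    """Build an undirected graph as {author: set of co-authors}.

    `papers` is a list of author lists, one list per paper
    (a hypothetical input shape chosen for this sketch).
    """
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        # Link every pair of authors who appear on the same paper.
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def bfs_distances(graph, source):
    """Return shortest hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs.

    Assumption: unreachable pairs are simply skipped.
    """
    total = 0
    pairs = 0
    for source in graph:
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    # Toy data, not the real export: two papers sharing one author.
    toy_papers = [["Alice", "Bob"], ["Bob", "Carol"]]
    print(average_path_length(build_coauthor_graph(toy_papers)))
```

Running one breadth-first search per node is fine for a graph of 10 authors; a much bigger export would probably call for something smarter.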

I am really looking forward to trying this on a bigger XML file and doing more interesting analysis.

Thursday, 2 July 2009

More Perl and some reading

Today, I went ahead and used Perl to do some work.

I exported 50 papers in XML format from the ECS eprints, then used the XML::Simple module from CPAN to parse the file and try to get some data out of it.

I successfully printed out each author's name and id and each paper's title from the given XML file in the format I wanted. This means I am now able to get data out of an XML file, so I should start thinking about the network attributes I want to calculate for my research, and how to write programs to calculate them.
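
As a rough illustration of this extraction step, here is a small sketch in Python with the standard-library XML parser, rather than the Perl XML::Simple module I actually used. The sample XML, its element and attribute names (`papers`, `paper`, `title`, `author`, `id`) and the function name are all invented for the example. The real ECS eprints export will look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: these element and attribute names are assumptions,
# not the real ECS eprints export schema.
SAMPLE_XML = """
<papers>
  <paper>
    <title>Paper One</title>
    <author id="a1">Alice Example</author>
    <author id="a2">Bob Example</author>
  </paper>
  <paper>
    <title>Paper Two</title>
    <author id="a2">Bob Example</author>
  </paper>
</papers>
"""


def extract_records(xml_text):
    """Return one (author_id, author_name, paper_title) tuple per authorship."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for paper in root.findall("paper"):
        title = paper.findtext("title", default="").strip()
        for author in paper.findall("author"):
            name = (author.text or "").strip()
            records.append((author.get("id"), name, title))
    return records


if __name__ == "__main__":
    # Print tab-separated rows: author id, author name, paper title.
    for author_id, name, title in extract_records(SAMPLE_XML):
        print(f"{author_id}\t{name}\t{title}")
```

Keeping one row per (author, paper) pair makes it easy to group by paper later when turning the data into a co-authorship network.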

For the rest of the day, I did some reading instead and found a paper that discusses which quantities to calculate to characterise a network, and why and how (Newman, M. The structure and function of complex networks. arXiv preprint cond-mat/0303516, 2003). I can see myself starting to write some programs to calculate them tomorrow.

Wednesday, 1 July 2009

Learning Perl

Another hot day here in Southampton.

I had a project meeting with Dr Carr today to report my progress, and was given some suggestions.

My big plan is sound: gather a dataset of University staff into some data structure -> analyse the data (how to analyse it requires more reading) -> generate some diagrams.

Dr Carr suggested that a scripting language like Perl or Python would be more suitable than a programming language like C in this context, so I had a look at them this afternoon. A quick comparison made me stick with Perl: it is a good text-manipulation language (I have always wanted to learn one), and from the impression I get from the search results, it has more learning resources and a more active community.

I started following a tutorial this afternoon, wrote a couple of scripts, and used several variable types: hashes, scalars, arrays and filehandles. Everything is going smoothly, so I should be able to start using it in a day or two.

Once I get the hang of Perl, I should start practising analysing and manipulating a small set of people records exported from the ECS eprints.