Today I had to do a Perl program to poke inside an XML file and read out some values. I had a quick go with the CPAN module XML::Simple but found it to be horrible. Yuck. I ended up using XML::MyXML instead and I must say I’m much more impressed with it. I also wrote some test code just to be sure everything was ok and guarantee it doesn’t break if someone changes the XML. Here is a quick roundup of what I did.
This was the XML I had:
Param1
domain
Filetype1
machine1
machine1
machine2
/export/home/user/file1.tmp
Thursday, 29 January 2009 08:45:05 GMT
FLOWING
Only the attributes “Destinations” and “File_Names” are more complicated because they can contain multiple entries. Everything else is pretty straight forward, even using XML::Simple.
The Main Program
Here is the main file I did. It consists basically of opening the XML file, reading it and calling one of the functions that reads one of the attributes of the XML, in this case, get_filetype.
#!/usr/bin/perl -w
use strict;
use XML::MyXML qw(tidy_xml xml_to_object);
use XML::MyXML qw(:all);
require "readXML.pl";
sub main() {
my $file = "test.xml";
# this is needed because we read the whole file at once
my $holdTerminator = $/;
undef $/;
open(FILE, "<$file") || die "Could not open file: $1";
my $xml = ;
#print tidy_xml($xml); # uncomment to get a nice view of the XML
my $obj = xml_to_object($xml);
print get_filetype($obj) . "\n";
}
main();
The XML Reading
The next file is here I’ve abstracted all the functions that are reading values from the XML object we created earlier. It’s basically the XML reading part. I like the way XML::MyXML allows the reading of values by path. That’s cool. The first two sub-routines return arrays and the rest returns a string.
#!/usr/bin/perl -w
use strict;
use XML::MyXML qw(tidy_xml xml_to_object);
use XML::MyXML qw(:all);
# returns an array with the destinations
sub get_destinations {
my $obj = shift;
my @destinations;
foreach my $d ($obj->path('Mandatory_Info/Destinations/Destination')) {
push @destinations, $d->value;
}
return @destinations;
}
# returns an array with the filenames
sub get_filenames {
my $obj = shift;
my @filenames;
foreach my $d ($obj->path('Mandatory_Info/File_Names/File')) {
push @filenames, $d->value;
}
return @filenames;
}
sub get_filetype {
my $obj = shift;
return $obj->path('Mandatory_Info/File_Type')->value;
}
sub get_monitor {
my $obj = shift;
return $obj->path('Mandatory_Info/Monitor')->value;
}
sub get_domain {
my $obj = shift;
return $obj->path('Mandatory_Info/Domain')->value;
}
sub get_source {
my $obj = shift;
return $obj->path('Mandatory_Info/Source')->value;
}
sub get_submissiontime {
my $obj = shift;
return $obj->path('Mandatory_Info/Submission_Time')->value;
}
sub get_filestatus {
my $obj = shift;
return $obj->path('Optional_Info/File_Status')->value;
}
# needed to use a require when including this file
return 1;
Test Code and Data
And following is the test code I wrote. Not a big deal but it makes easy to check if the functions are working correctly. I choosed to embed a test XML in this file instead of reading it from a file.
#!/usr/bin/perl -w
use strict;
use XML::MyXML qw(tidy_xml xml_to_object);
use XML::MyXML qw(:all);
require "readXML.pl";
my $xml = "
Param1
domain
tmp
machine1
machine1
machine2
/export/home/user/file1.tmp
Thursday, 29 January 2009 08:45:05 GMT
FLOWING
";
my $obj = xml_to_object($xml);
run_tests($obj);
# Test sub-routine to assert that we are reading the correct
# values from the XML. Use only with $test_xml.
sub run_tests {
my $obj = shift;
my @destinations = get_destinations($obj);
assert ($destinations[0] , "machine1", "get_destinations");
assert ($destinations[1] , "machine2", "get_destinations");
my @file_names = get_filenames($obj);
assert ($file_names[0] , "/export/home/user/file1.tmp",
"get_filenames");
my $filetype = get_filetype($obj);
assert ($filetype , "tmp", "get_filetype");
my $monitor = get_monitor($obj);
assert ($monitor , "Param1", "get_monitor");
my $domain = get_domain($obj);
assert ($domain , "domain", "get_domain");
my $source = get_source($obj);
assert ($source , "machine1", "get_source");
my $submission_time = get_submissiontime($obj);
assert ($submission_time , "Thursday, 29 January 2009 08:45:05 GMT",
"get_submissiontime");
my $file_status = get_filestatus($obj);
assert ($file_status , "FLOWING", "get_filestatus");
}
# Test if the first two arguments are the same string
# Used in run_tests from testing
sub assert {
my $arg1 = shift;
my $arg2 = shift;
my $descr = shift;
if ($arg1 eq $arg2) {
print "PASSED\n";
return;
}
print $descr . "\tFAILED!!!\n";
return;
}
Hope it helps someone. It will surely help me the next time I try to do it. My Perl skills were all rusty.
joao 6:57 am on December 11, 2009 Permalink
Unfortunately I don’t know any. I would be interested in such groups but not only in Lisbon. Portugal has Hackers, Makers and Geeks also outside Lisbon, some of them are truly good ones and other truly bad ones, like in Lisbon :)
Can’t understand why we have so many Lisbon Groups and not Portugal Groups instead.
Pedro Sousa 9:46 am on December 11, 2009 Permalink
btw, It’s not Lisbon Ruby Group, it’s portuguese ruby group. we co-organized some dinners in Coimbra and Porto.
I think it’s just a mather of audience, if enough people organize stuff outside of Lisbon, the rest of the community will mostly follow (ex: takeoff, barcamp).
RicardoCastelhano 10:19 am on December 11, 2009 Permalink
Heyas Nuno,
tens também o grupo
XAMLPT – Comunidade Portuguesa para Tecnologias de Apresentação da Microsoft: WPF, Silverlight, Surface (http://www.xamlpt.com)
Bom trabalho com o blog !
Nuno Morgadinho 7:28 am on December 12, 2009 Permalink
Thanks guys for all the feedback.
There is also Ignite Portugal. Their next meeting will be in 17th of December, in Lisbon.
Jose 4:04 am on December 28, 2009 Permalink
There you go the Portuguese PHP User Group
http://groups.google.com/group/portugal-phpug
I believe that there are some more, but don’t know the addresses.