Introduction

We will use an example to demonstrate the fundamental steps of web scraping with R. We first obtain a web page, drill into its document structure, and then extract data using a few HTML tags. After that, a few string manipulations clean up the data. The resource section points to more comprehensive material.

Preparations

There are several packages that may come in handy.

library(pacman)
# p_load() installs any packages that are missing and then attaches them all
p_load(rvest, tidyverse, stringr, rebus, lubridate, XML, xml2)

We won’t need all of them for this tutorial, but it is still good to have them ready.
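
If pacman is not available, roughly the same effect can be had with base R. The following is a minimal sketch, not pacman's actual implementation: the helper name `missing_packages` is made up for illustration, and the install step is left commented out so that nothing is downloaded by accident.

```r
# Hypothetical helper: return the requested packages whose namespace cannot be loaded
# (typically because they are not installed).
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!available])
}

wanted <- c("rvest", "tidyverse", "stringr", "rebus", "lubridate", "XML", "xml2")
to_install <- missing_packages(wanted)

# Uncomment to install whatever is missing and then attach everything:
# if (length(to_install) > 0) install.packages(to_install)
# invisible(lapply(wanted, library, character.only = TRUE))
```

Checking first and installing only what is missing keeps repeated runs of the script fast.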

First, obtain the URL of the page that contains the HTML content and explore its source (e.g. Firefox >> right-click >> View Page Source). In this tutorial we will extract information from the TSPLIB page.

url = 'http://elib.zib.de/pub/mp-testdata/tsp/tsplib/tsp/'

Get the document

First we need to download the document and find the entry point. The following command, htmlTreeParse from the XML package, downloads and parses the HTML page and tries to repair malformed tags.

doc = htmlTreeParse(url)
doc
## $file
## [1] "http://elib.zib.de/pub/mp-testdata/tsp/tsplib/tsp/"
## 
## $version
## [1] ""
## 
## $children
## $children$html
## <html>
##  <head>
##   <title>MP-TESTDATA - The TSPLIB Symmetric Traveling Salesman Problem Instances</title>
##  </head>
##  <body>
##   <h4>
##    <center>MP-TESTDATA - The TSPLIB Symmetric Traveling Salesman Problem Instances</center>
##   </h4>
##   <hr/>
##   <ul>
##    <li>
##     <a href="a280.tsp">a280.tsp</a>
##     -
##    Drilling problem (Ludwig)
##     <p/>
##    </li>
##    <li>
##     <a href="a280.opt.tour">a280.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>a280</i>
##     <p/>
##    </li>
##    <li>
##     <a href="ali535.tsp">ali535.tsp</a>
##     -  
##    535 Airports around the globe (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="att48.tsp">att48.tsp</a>
##     - 
##    48 capitals of the US (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="att48.opt.tour">att48.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>att48</i>
##     <p/>
##    </li>
##    <li>
##     <a href="att532.tsp">att532.tsp</a>
##     -  
##    532-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="bayg29.tsp">bayg29.tsp</a>
##     - 
##    29 Cities in Bavaria (geographical distance)
##     <p/>
##    </li>
##    <li>
##     <a href="bayg29.opt.tour">bayg29.opt.tour</a>
##     -   
##    Optimum solution of
##     <i>bayg29</i>
##     <p/>
##    </li>
##    <li>
##     <a href="bays29.tsp">bays29.tsp</a>
##     - 
##    29 Cities in Bavaria (street distance)
##     <p/>
##    </li>
##    <li>
##     <a href="bays29.opt.tour">bays29.opt.tour</a>
##     -
##    Optimum solution of
##     <i>bays29</i>
##     <p/>
##    </li>
##    <li>
##     <a href="berlin52.tsp">berlin52.tsp</a>
##     - 
##    52 locations in Berlin (Germany) (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="berlin52.opt.tour">berlin52.opt.tour</a>
##     -
##    Optimum tour for
##     <i>berlin52</i>
##     <p/>
##    </li>
##    <li>
##     <a href="bier127.tsp">bier127.tsp</a>
##     - 
##    127 beergardens in the Augsburg (Germany) area (Juenger/Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="brazil58.tsp">brazil58.tsp</a>
##     - 
##    58 cities in Brazil (C. Ferreira)
##     <p/>
##    </li>
##    <li>
##     <a href="brd14051.tsp">brd14051.tsp</a>
##     - 
##    Federal Republic of Germany with borders as of 1989 (Bachem/Wottawa)
##     <p/>
##    </li>
##    <li>
##     <a href="brg180.tsp">brg180.tsp</a>
##     - 
##    Bridge tournament problem (Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="brg180.opt.tour">brg180.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>brg180</i>
##     <p/>
##    </li>
##    <li>
##     <a href="burma14.tsp">burma14.tsp</a>
##     - 
##    14 cities in Burma (geographical coordinates)
##     <p/>
##    </li>
##    <li>
##     <a href="ch130.tsp">ch130.tsp</a>
##     - 
##    130 city problem (Churritz)
##     <p/>
##    </li>
##    <li>
##     <a href="ch130.opt.tour">ch130.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>ch130</i>
##     <p/>
##    </li>
##    <li>
##     <a href="ch150.tsp">ch150.tsp</a>
##     150 city problem (Churritz)
##     <p/>
##    </li>
##    <li>
##     <a href="ch150.opt.tour">ch150.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>ch150</i>
##     <p/>
##    </li>
##    <li>
##     <a href="d1291.tsp">d1291.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="d1655.tsp">d1655.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="d18512.tsp">d18512.tsp</a>
##     - 
##    Federal Republic of Germany (with ex-GDR territory) (Bachem/Wottawa)
##     <p/>
##    </li>
##    <li>
##     <a href="d198.tsp">d198.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="d2103.tsp">d2103.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="d493.tsp">d493.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="d657.tsp">d657.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="dantzig42.tsp">dantzig42.tsp</a>
##     - 
##    42 cities (Dantzig)
##     <p/>
##    </li>
##    <li>
##     <a href="dsj1000.tsp">dsj1000.tsp</a>
##     - 
##    Clustered random problem (Johnson)
##     <p/>
##    </li>
##    <li>
##     <a href="eil101.tsp">eil101.tsp</a>
##     -  
##    101-city problem (Christofides/Eilon)
##     <p/>
##    </li>
##    <li>
##     <a href="eil101.opt.tour">eil101.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>eil101</i>
##     <p/>
##    </li>
##    <li>
##     <a href="eil51.tsp">eil51.tsp</a>
##     - 
##    51-city problem (Christofides/Eilon)
##     <p/>
##    </li>
##    <li>
##     <a href="eil51.opt.tour">eil51.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>eil51</i>
##     <p/>
##    </li>
##    <li>
##     <a href="eil76.tsp">eil76.tsp</a>
##     - 
##    76-city problem (Christofides/Eilon)
##     <p/>
##    </li>
##    <li>
##     <a href="eil76.opt.tour">eil76.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>eil76</i>
##     <p/>
##    </li>
##    <li>
##     <a href="fl1400.tsp">fl1400.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="fl1577.tsp">fl1577.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="fl3795.tsp">fl3795.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="fl417.tsp">fl417.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="fnl4461.tsp">fnl4461.tsp</a>
##     - 
##    The five new Federal States of Germany (ex-GDR territory)
##    (Bachem/Wottawa)
##     <p/>
##    </li>
##    <li>
##     <a href="fri26.tsp">fri26.tsp</a>
##     - 
##    26-city problem (Fricker)
##     <p/>
##    </li>
##    <li>
##     <a href="fri26.opt.tour">fri26.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>fri26</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gil262.tsp">gil262.ts</a>
##     - 
##    262-city problem (Gillet/Johnson)
##     <p/>
##    </li>
##    <li>
##     <a href="gr120.tsp">gr120.tsp</a>
##     - 
##    120 cities in Germany (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr120.opt.tour">gr120.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>gr120</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gr137.tsp">gr137.tsp</a>
##     - 
##    America-Subproblem of 666-city TSP (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr17.tsp">gr17.tsp</a>
##     - 
##    17-city problem (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr202.tsp">gr202.tsp</a>
##     - 
##    Europe-Subproblem of 666-city TSP (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr202.opt.tour">gr202.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>gr202</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gr21.tsp">gr21.tsp</a>
##     - 
##    21-city problem (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr229.tsp">gr229.tsp</a>
##     - 
##    Asia/Australia-Subproblem of 666-city TSP (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr24.tsp">gr24.tsp</a>
##     - 
##    24-city problem (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr24.opt.tour">gr24.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>gr24</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gr431.tsp">gr431.tsp</a>
##     - 
##    Europe/Asia/Australia-Subproblem of 666-city TSP (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr48.tsp">gr48.tsp</a>
##     - 
##    48-city problem (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr48.opt.tour">gr48.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>gr48</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gr666.tsp">gr666.tsp</a>
##     - 
##    666 cities around the world (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr666.opt.tour">gr666.opt.tour</a>
##     - 
##    Optimum solution of
##     <i>gr666</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gr96.tsp">gr96.tsp</a>
##     - 
##    Africa-Subproblem of 666-city TSP (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr96.opt.tour">gr96.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>gr96</i>
##     <p/>
##    </li>
##    <li>
##     <a href="hk48.tsp">hk48.tsp</a>
##     - 
##    48-city problem (Held/Karp)
##     <p/>
##    </li>
##    <li>
##     <a href="kroA100.tsp">kroa100.tsp</a>
##     - 
##    100-city problem A (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroA100.opt.tour">kroa100.opt.tour</a>
##     - 
##    Optimum for
##     <i>kroa100</i>
##     <p/>
##    </li>
##    <li>
##     <a href="kroA150.tsp">kroa150.tsp</a>
##     - 
##    150-city problem A (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroA200.tsp">kroa200.tsp</a>
##     - 
##    200-city problem A (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroB100.tsp">krob100.tsp</a>
##     - 
##    100-city problem B (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroB150.tsp">krob150.tsp</a>
##     - 
##    150-city problem B (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroB200.tsp">krob200.tsp</a>
##     - 
##    200-city problem B (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroC100.tsp">kroc100.tsp</a>
##     - 
##    100-city problem C (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroC100.opt.tour">kroc100.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>kroc100</i>
##     <p/>
##    </li>
##    <li>
##     <a href="kroD100.tsp">krod100.tsp</a>
##     - 
##    100-city problem D (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroD100.opt.tour">krod100.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>krod100</i>
##     <p/>
##    </li>
##    <li>
##     <a href="kroE100.tsp">kroe100.tsp</a>
##     - 
##    100-city problem E (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="lin105.tsp">lin105.tsp</a>
##     - 
##    105-city problem (Subproblem of lin318)
##     <p/>
##    </li>
##    <li>
##     <a href="lin105.opt.tour">lin105.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>lin105</i>
##     <p/>
##    </li>
##    <li>
##     <a href="lin318.tsp">lin318.tsp</a>
##     - 
##    318-city problem (Lin/Kernighan)
##     <p/>
##    </li>
##    <li>
##     <a href="linhp318.tsp">linhp318.tsp</a>
##     - 
##    Original 318-city problem (Lin/Kernighan)
##     <p/>
##    </li>
##    <li>
##     <a href="nrw1379.tsp">nrw1379.tsp</a>
##     - 
##    Nordrhein-Westfalen (Bachem/Wottawa)
##     <p/>
##    </li>
##    <li>
##     <a href="p654.tsp">p654.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="pa561.tsp">pa561.tsp</a>
##     - 
##    561-city problem (Kleinschmidt)
##     <p/>
##    </li>
##    <li>
##     <a href="pa561.opt.tour">pa561.opt.tour</a>
##     - 
##    Optimal.tour&quot;&gt;  -  for pa561
##     <p/>
##    </li>
##    <li>
##     <a href="pcb1173.tsp">pcb1173.tsp</a>
##     - 
##    Drilling problem (Juenger/Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="pcb3038.tsp">pcb3038.tsp</a>
##     - 
##    Drilling problem (Juenger/Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="pcb442.tsp">pcb442.tsp</a>
##     - 
##    Drilling problem (Groetschel/Juenger/Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="pcb442.opt.tour">pcb442.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>pcb442</i>
##     <p/>
##    </li>
##    <li>
##     <a href="pla33810.tsp">pla33810.tsp</a>
##     - 
##    Programmed logic array (D.S.Johnson)
##     <p/>
##    </li>
##    <li>
##     <a href="pla7397.tsp">pla7397.tsp</a>
##     - 
##    Programmed logic array (D.S.Johnson)
##     <p/>
##    </li>
##    <li>
##     <a href="pla85900.tsp">pla85900.tsp</a>
##     - 
##    Programmed logic array (D.S.Johnson)
##     <p/>
##    </li>
##    <li>
##     <a href="pr1002.tsp">pr1002.tsp</a>
##     - 
##    1002-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr1002.opt.tour">pr1002.opt.tour</a>
##     - 
##    optimal.tour&quot;&gt;  -  for pr1002
##     <p/>
##    </li>
##    <li>
##     <a href="pr107.tsp">pr107.tsp</a>
##     - 
##    107-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr124.tsp">pr124.tsp</a>
##     - 
##    124-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr136.tsp">pr136.tsp</a>
##     - 
##    136-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr144.tsp">pr144.tsp</a>
##     - 
##    144-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr152.tsp">pr152.tsp</a>
##     - 
##    152-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr226.tsp">pr226.tsp</a>
##     - 
##    226-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr2392.tsp">pr2392.tsp</a>
##     - 
##    2392-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr2392.opt.tour">pr2392.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>pr2392</i>
##     <p/>
##    </li>
##    <li>
##     <a href="pr264.tsp">pr264.tsp</a>
##     - 
##    264-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr299.tsp">pr299.tsp</a>
##     - 
##    299-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr439.tsp">pr439.tsp</a>
##     - 
##    439-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr76.tsp">pr76.tsp</a>
##     - 
##    76-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr76.opt.tour">pr76.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>pr76</i>
##     <p/>
##    </li>
##    <li>
##     <a href="rat195.tsp">rat195.tsp</a>
##     - 
##    Rattled grid (Pulleyblank)
##     <p/>
##    </li>
##    <li>
##     <a href="rat575.tsp">rat575.tsp</a>
##     - 
##    Rattled grid (Pulleyblank)
##     <p/>
##    </li>
##    <li>
##     <a href="rat783.tsp">rat783.tsp</a>
##     - 
##    Rattled grid (Pulleyblank)
##     <p/>
##    </li>
##    <li>
##     <a href="rat99.tsp">rat99.tsp</a>
##     - 
##    Rattled grid (Pulleyblank)
##     <p/>
##    </li>
##    <li>
##     <a href="rd100.tsp">rd100.tsp</a>
##     - 
##    100-city random TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rd100.opt.tour">rd100.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>rd100</i>
##     <p/>
##    </li>
##    <li>
##     <a href="rd400.tsp">rd400.tsp</a>
##     - 
##    400-city random TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl11849.tsp">rl11849.tsp</a>
##     - 
##    11849-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl1304.tsp">rl1304.tsp</a>
##     - 
##    1304-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl1323.tsp">rl1323.tsp</a>
##     - 
##    1323-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl1889.tsp">rl1889.tsp</a>
##     - 
##    1889-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl5915.tsp">rl5915.tsp</a>
##     - 
##    5915-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl5934.tsp">rl5934.tsp</a>
##     - 
##    5934-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="si1032.tsp">si1032.tsp</a>
##     - 
##    1032-vertex TSP (M. Hofmeister)
##     <p/>
##    </li>
##    <li>
##     <a href="si175.tsp">si175.tsp</a>
##     - 
##    175-vertex TSP (M. Hofmeister)
##     <p/>
##    </li>
##    <li>
##     <a href="si535.tsp">si535.tsp</a>
##     - 
##    535-vertex TSP (M. Hofmeister)
##     <p/>
##    </li>
##    <li>
##     <a href="st70.tsp">st70.tsp</a>
##     - 
##    70-city problem (Smith/Thompson)
##     <p/>
##    </li>
##    <li>
##     <a href="st70.opt.tour">st70.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>st70</i>
##     <p/>
##    </li>
##    <li>
##     <a href="swiss42.tsp">swiss42.tsp</a>
##     - 
##    42 cities Switzerland (Fricker)
##     <p/>
##    </li>
##    <li>
##     <a href="ts225.tsp">ts225.tsp</a>
##     - 
##    225-city problem (Juenger,Raecke,Tschoecke)
##     <p/>
##    </li>
##    <li>
##     <a href="tsp225.tsp">tsp225.tsp</a>
##     -
##    A TSP problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href=".tsp&quot;">225.opt.tour&quot;&gt;</a>
##     - 
##    Optimal solution for
##     <i>tsp225.tsp</i>
##     <p/>
##    </li>
##    <li>
##     <a href="u1060.tsp">u1060.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u1432.tsp">u1432.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u159.tsp">u159.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u1817.tsp">u1817.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u2152.tsp">u2152.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u2319.tsp">u2319.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u574.tsp">u574.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u724.tsp">u724.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="ulysses16.tsp">ulysses16.tsp</a>
##     - 
##    Odyssey of Ulysses (Groetschel and Padberg)
##     <p/>
##    </li>
##    <li>
##     <a href="ulysses16.opt.tour">ulysses16.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>ulysses16</i>
##     <p/>
##    </li>
##    <li>
##     <a href="ulysses22.tsp">ulysses22.tsp</a>
##     - 
##    Odyssey of Ulysses (Groetschel and Padberg)
##     <p/>
##    </li>
##    <li>
##     <a href="ulysses22.opt.tour">ulysses22.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>ulysses22</i>
##     <p/>
##    </li>
##    <li>
##     <a href="usa13509.tsp">usa13509.tsp</a>
##     - 
##    Cities with population at least 500 in the continental US (David
##    Applegate and Andre Rohe)
##     <p/>
##    </li>
##    <li>
##     <a href="vm1084.tsp">vm1084.tsp</a>
##     - 
##    1084-city problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="vm1748.tsp">vm1748.tsp</a>
##     - 
##    1784-city problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="xray.problems">xray.problems</a>
##     -
##    Xray crystallography TSP (F. Nielsen, D. Shallcross) 
##    (source code for generator)
##    </li>
##   </ul>
##   <hr/>
##   <table align="LEFT" cellspacing="0" width="100%">
##    <caption/>
##    <tr align="LEFT">
##     <td align="LEFT">
##      <em>Last update: June 1, 1995</em>
##     </td>
##     <td align="LEFT">
##      <a target="_top" href="http://www.zib.de/adm/uaf.srv?-e+skorobohatyj">
##       <em>Georg Skorobohatyj</em>
##      </a>
##     </td>
##     <td align="RIGHT">
##      <a target="_top" href="http://www.zib.de/">
##       <em>ZIB Homepage</em>
##      </a>
##     </td>
##    </tr>
##   </table>
##   <br/>
##   <br/>
##   <font size="-1">URL: ftp://ftp.zib.de/pub/Packages/mp-testdata/tsplib/tsp/index.html</font>
##  </body>
## </html>
## 
## 
## attr(,"class")
## [1] "XMLDocumentContent"
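
The repair step can also be tried offline on a small string. The sketch below assumes the XML package is installed. The HTML snippet is invented for illustration: it imitates two list entries of the page with their closing `</li>` tags left out. The argument `asText = TRUE` tells htmlTreeParse to treat its input as markup rather than as a file name or URL.

```r
library(XML)

# Invented snippet: neither <li> element is closed.
snippet <- paste0(
  "<html><body><ul>",
  "<li><a href='a280.tsp'>a280.tsp</a> - Drilling problem (Ludwig)",
  "<li><a href='att48.tsp'>att48.tsp</a> - 48 capitals of the US",
  "</ul></body></html>"
)

mini <- htmlTreeParse(snippet, asText = TRUE)

# The result has the same layout as the output above: $file, $version, $children.
ul <- mini$children$html[["body"]][["ul"]]

n_items <- xmlSize(ul)                                  # number of child nodes of <ul>
item_tags <- unname(sapply(xmlChildren(ul), xmlName))   # tag name of each child
print(n_items)
print(item_tags)
```

If the parser repairs the snippet as intended, the two unclosed entries end up as sibling `li` nodes under `ul` rather than nested inside each other.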

Next, identify the root of the parsed tree, which is our entry point.

root = xmlRoot(doc)
root
## <html>
##  <head>
##   <title>MP-TESTDATA - The TSPLIB Symmetric Traveling Salesman Problem Instances</title>
##  </head>
##  <body>
##   <h4>
##    <center>MP-TESTDATA - The TSPLIB Symmetric Traveling Salesman Problem Instances</center>
##   </h4>
##   <hr/>
##   <ul>
##    <li>
##     <a href="a280.tsp">a280.tsp</a>
##     -
##    Drilling problem (Ludwig)
##     <p/>
##    </li>
##    <li>
##     <a href="a280.opt.tour">a280.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>a280</i>
##     <p/>
##    </li>
##    <li>
##     <a href="ali535.tsp">ali535.tsp</a>
##     -  
##    535 Airports around the globe (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="att48.tsp">att48.tsp</a>
##     - 
##    48 capitals of the US (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="att48.opt.tour">att48.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>att48</i>
##     <p/>
##    </li>
##    <li>
##     <a href="att532.tsp">att532.tsp</a>
##     -  
##    532-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="bayg29.tsp">bayg29.tsp</a>
##     - 
##    29 Cities in Bavaria (geographical distance)
##     <p/>
##    </li>
##    <li>
##     <a href="bayg29.opt.tour">bayg29.opt.tour</a>
##     -   
##    Optimum solution of
##     <i>bayg29</i>
##     <p/>
##    </li>
##    <li>
##     <a href="bays29.tsp">bays29.tsp</a>
##     - 
##    29 Cities in Bavaria (street distance)
##     <p/>
##    </li>
##    <li>
##     <a href="bays29.opt.tour">bays29.opt.tour</a>
##     -
##    Optimum solution of
##     <i>bays29</i>
##     <p/>
##    </li>
##    <li>
##     <a href="berlin52.tsp">berlin52.tsp</a>
##     - 
##    52 locations in Berlin (Germany) (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="berlin52.opt.tour">berlin52.opt.tour</a>
##     -
##    Optimum tour for
##     <i>berlin52</i>
##     <p/>
##    </li>
##    <li>
##     <a href="bier127.tsp">bier127.tsp</a>
##     - 
##    127 beergardens in the Augsburg (Germany) area (Juenger/Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="brazil58.tsp">brazil58.tsp</a>
##     - 
##    58 cities in Brazil (C. Ferreira)
##     <p/>
##    </li>
##    <li>
##     <a href="brd14051.tsp">brd14051.tsp</a>
##     - 
##    Federal Republic of Germany with borders as of 1989 (Bachem/Wottawa)
##     <p/>
##    </li>
##    <li>
##     <a href="brg180.tsp">brg180.tsp</a>
##     - 
##    Bridge tournament problem (Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="brg180.opt.tour">brg180.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>brg180</i>
##     <p/>
##    </li>
##    <li>
##     <a href="burma14.tsp">burma14.tsp</a>
##     - 
##    14 cities in Burma (geographical coordinates)
##     <p/>
##    </li>
##    <li>
##     <a href="ch130.tsp">ch130.tsp</a>
##     - 
##    130 city problem (Churritz)
##     <p/>
##    </li>
##    <li>
##     <a href="ch130.opt.tour">ch130.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>ch130</i>
##     <p/>
##    </li>
##    <li>
##     <a href="ch150.tsp">ch150.tsp</a>
##     150 city problem (Churritz)
##     <p/>
##    </li>
##    <li>
##     <a href="ch150.opt.tour">ch150.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>ch150</i>
##     <p/>
##    </li>
##    <li>
##     <a href="d1291.tsp">d1291.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="d1655.tsp">d1655.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="d18512.tsp">d18512.tsp</a>
##     - 
##    Federal Republic of Germany (with ex-GDR territory) (Bachem/Wottawa)
##     <p/>
##    </li>
##    <li>
##     <a href="d198.tsp">d198.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="d2103.tsp">d2103.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="d493.tsp">d493.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="d657.tsp">d657.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="dantzig42.tsp">dantzig42.tsp</a>
##     - 
##    42 cities (Dantzig)
##     <p/>
##    </li>
##    <li>
##     <a href="dsj1000.tsp">dsj1000.tsp</a>
##     - 
##    Clustered random problem (Johnson)
##     <p/>
##    </li>
##    <li>
##     <a href="eil101.tsp">eil101.tsp</a>
##     -  
##    101-city problem (Christofides/Eilon)
##     <p/>
##    </li>
##    <li>
##     <a href="eil101.opt.tour">eil101.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>eil101</i>
##     <p/>
##    </li>
##    <li>
##     <a href="eil51.tsp">eil51.tsp</a>
##     - 
##    51-city problem (Christofides/Eilon)
##     <p/>
##    </li>
##    <li>
##     <a href="eil51.opt.tour">eil51.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>eil51</i>
##     <p/>
##    </li>
##    <li>
##     <a href="eil76.tsp">eil76.tsp</a>
##     - 
##    76-city problem (Christofides/Eilon)
##     <p/>
##    </li>
##    <li>
##     <a href="eil76.opt.tour">eil76.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>eil76</i>
##     <p/>
##    </li>
##    <li>
##     <a href="fl1400.tsp">fl1400.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="fl1577.tsp">fl1577.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="fl3795.tsp">fl3795.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="fl417.tsp">fl417.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="fnl4461.tsp">fnl4461.tsp</a>
##     - 
##    The five new Federal States of Germany (ex-GDR territory)
##    (Bachem/Wottawa)
##     <p/>
##    </li>
##    <li>
##     <a href="fri26.tsp">fri26.tsp</a>
##     - 
##    26-city problem (Fricker)
##     <p/>
##    </li>
##    <li>
##     <a href="fri26.opt.tour">fri26.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>fri26</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gil262.tsp">gil262.ts</a>
##     - 
##    262-city problem (Gillet/Johnson)
##     <p/>
##    </li>
##    <li>
##     <a href="gr120.tsp">gr120.tsp</a>
##     - 
##    120 cities in Germany (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr120.opt.tour">gr120.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>gr120</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gr137.tsp">gr137.tsp</a>
##     - 
##    America-Subproblem of 666-city TSP (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr17.tsp">gr17.tsp</a>
##     - 
##    17-city problem (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr202.tsp">gr202.tsp</a>
##     - 
##    Europe-Subproblem of 666-city TSP (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr202.opt.tour">gr202.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>gr202</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gr21.tsp">gr21.tsp</a>
##     - 
##    21-city problem (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr229.tsp">gr229.tsp</a>
##     - 
##    Asia/Australia-Subproblem of 666-city TSP (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr24.tsp">gr24.tsp</a>
##     - 
##    24-city problem (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr24.opt.tour">gr24.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>gr24</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gr431.tsp">gr431.tsp</a>
##     - 
##    Europe/Asia/Australia-Subproblem of 666-city TSP (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr48.tsp">gr48.tsp</a>
##     - 
##    48-city problem (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr48.opt.tour">gr48.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>gr48</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gr666.tsp">gr666.tsp</a>
##     - 
##    666 cities around the world (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr666.opt.tour">gr666.opt.tour</a>
##     - 
##    Optimum solution of
##     <i>gr666</i>
##     <p/>
##    </li>
##    <li>
##     <a href="gr96.tsp">gr96.tsp</a>
##     - 
##    Africa-Subproblem of 666-city TSP (Groetschel)
##     <p/>
##    </li>
##    <li>
##     <a href="gr96.opt.tour">gr96.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>gr96</i>
##     <p/>
##    </li>
##    <li>
##     <a href="hk48.tsp">hk48.tsp</a>
##     - 
##    48-city problem (Held/Karp)
##     <p/>
##    </li>
##    <li>
##     <a href="kroA100.tsp">kroa100.tsp</a>
##     - 
##    100-city problem A (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroA100.opt.tour">kroa100.opt.tour</a>
##     - 
##    Optimum for
##     <i>kroa100</i>
##     <p/>
##    </li>
##    <li>
##     <a href="kroA150.tsp">kroa150.tsp</a>
##     - 
##    150-city problem A (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroA200.tsp">kroa200.tsp</a>
##     - 
##    200-city problem A (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroB100.tsp">krob100.tsp</a>
##     - 
##    100-city problem B (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroB150.tsp">krob150.tsp</a>
##     - 
##    150-city problem B (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroB200.tsp">krob200.tsp</a>
##     - 
##    200-city problem B (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroC100.tsp">kroc100.tsp</a>
##     - 
##    100-city problem C (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroC100.opt.tour">kroc100.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>kroc100</i>
##     <p/>
##    </li>
##    <li>
##     <a href="kroD100.tsp">krod100.tsp</a>
##     - 
##    100-city problem D (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="kroD100.opt.tour">krod100.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>krod100</i>
##     <p/>
##    </li>
##    <li>
##     <a href="kroE100.tsp">kroe100.tsp</a>
##     - 
##    100-city problem E (Krolak/Felts/Nelson)
##     <p/>
##    </li>
##    <li>
##     <a href="lin105.tsp">lin105.tsp</a>
##     - 
##    105-city problem (Subproblem of lin318)
##     <p/>
##    </li>
##    <li>
##     <a href="lin105.opt.tour">lin105.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>lin105</i>
##     <p/>
##    </li>
##    <li>
##     <a href="lin318.tsp">lin318.tsp</a>
##     - 
##    318-city problem (Lin/Kernighan)
##     <p/>
##    </li>
##    <li>
##     <a href="linhp318.tsp">linhp318.tsp</a>
##     - 
##    Original 318-city problem (Lin/Kernighan)
##     <p/>
##    </li>
##    <li>
##     <a href="nrw1379.tsp">nrw1379.tsp</a>
##     - 
##    Nordrhein-Westfalen (Bachem/Wottawa)
##     <p/>
##    </li>
##    <li>
##     <a href="p654.tsp">p654.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="pa561.tsp">pa561.tsp</a>
##     - 
##    561-city problem (Kleinschmidt)
##     <p/>
##    </li>
##    <li>
##     <a href="pa561.opt.tour">pa561.opt.tour</a>
##     - 
##    Optimal.tour&quot;&gt;  -  for pa561
##     <p/>
##    </li>
##    <li>
##     <a href="pcb1173.tsp">pcb1173.tsp</a>
##     - 
##    Drilling problem (Juenger/Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="pcb3038.tsp">pcb3038.tsp</a>
##     - 
##    Drilling problem (Juenger/Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="pcb442.tsp">pcb442.tsp</a>
##     - 
##    Drilling problem (Groetschel/Juenger/Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="pcb442.opt.tour">pcb442.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>pcb442</i>
##     <p/>
##    </li>
##    <li>
##     <a href="pla33810.tsp">pla33810.tsp</a>
##     - 
##    Programmed logic array (D.S.Johnson)
##     <p/>
##    </li>
##    <li>
##     <a href="pla7397.tsp">pla7397.tsp</a>
##     - 
##    Programmed logic array (D.S.Johnson)
##     <p/>
##    </li>
##    <li>
##     <a href="pla85900.tsp">pla85900.tsp</a>
##     - 
##    Programmed logic array (D.S.Johnson)
##     <p/>
##    </li>
##    <li>
##     <a href="pr1002.tsp">pr1002.tsp</a>
##     - 
##    1002-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr1002.opt.tour">pr1002.opt.tour</a>
##     - 
##    optimal.tour&quot;&gt;  -  for pr1002
##     <p/>
##    </li>
##    <li>
##     <a href="pr107.tsp">pr107.tsp</a>
##     - 
##    107-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr124.tsp">pr124.tsp</a>
##     - 
##    124-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr136.tsp">pr136.tsp</a>
##     - 
##    136-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr144.tsp">pr144.tsp</a>
##     - 
##    144-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr152.tsp">pr152.tsp</a>
##     - 
##    152-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr226.tsp">pr226.tsp</a>
##     - 
##    226-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr2392.tsp">pr2392.tsp</a>
##     - 
##    2392-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr2392.opt.tour">pr2392.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>pr2392</i>
##     <p/>
##    </li>
##    <li>
##     <a href="pr264.tsp">pr264.tsp</a>
##     - 
##    264-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr299.tsp">pr299.tsp</a>
##     - 
##    299-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr439.tsp">pr439.tsp</a>
##     - 
##    439-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr76.tsp">pr76.tsp</a>
##     - 
##    76-city problem (Padberg/Rinaldi)
##     <p/>
##    </li>
##    <li>
##     <a href="pr76.opt.tour">pr76.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>pr76</i>
##     <p/>
##    </li>
##    <li>
##     <a href="rat195.tsp">rat195.tsp</a>
##     - 
##    Rattled grid (Pulleyblank)
##     <p/>
##    </li>
##    <li>
##     <a href="rat575.tsp">rat575.tsp</a>
##     - 
##    Rattled grid (Pulleyblank)
##     <p/>
##    </li>
##    <li>
##     <a href="rat783.tsp">rat783.tsp</a>
##     - 
##    Rattled grid (Pulleyblank)
##     <p/>
##    </li>
##    <li>
##     <a href="rat99.tsp">rat99.tsp</a>
##     - 
##    Rattled grid (Pulleyblank)
##     <p/>
##    </li>
##    <li>
##     <a href="rd100.tsp">rd100.tsp</a>
##     - 
##    100-city random TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rd100.opt.tour">rd100.opt.tour</a>
##     - 
##    Optimum solution for
##     <i>rd100</i>
##     <p/>
##    </li>
##    <li>
##     <a href="rd400.tsp">rd400.tsp</a>
##     - 
##    400-city random TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl11849.tsp">rl11849.tsp</a>
##     - 
##    11849-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl1304.tsp">rl1304.tsp</a>
##     - 
##    1304-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl1323.tsp">rl1323.tsp</a>
##     - 
##    1323-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl1889.tsp">rl1889.tsp</a>
##     - 
##    1889-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl5915.tsp">rl5915.tsp</a>
##     - 
##    5915-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="rl5934.tsp">rl5934.tsp</a>
##     - 
##    5934-city TSP (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="si1032.tsp">si1032.tsp</a>
##     - 
##    1032-vertex TSP (M. Hofmeister)
##     <p/>
##    </li>
##    <li>
##     <a href="si175.tsp">si175.tsp</a>
##     - 
##    175-vertex TSP (M. Hofmeister)
##     <p/>
##    </li>
##    <li>
##     <a href="si535.tsp">si535.tsp</a>
##     - 
##    535-vertex TSP (M. Hofmeister)
##     <p/>
##    </li>
##    <li>
##     <a href="st70.tsp">st70.tsp</a>
##     - 
##    70-city problem (Smith/Thompson)
##     <p/>
##    </li>
##    <li>
##     <a href="st70.opt.tour">st70.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>st70</i>
##     <p/>
##    </li>
##    <li>
##     <a href="swiss42.tsp">swiss42.tsp</a>
##     - 
##    42 cities Switzerland (Fricker)
##     <p/>
##    </li>
##    <li>
##     <a href="ts225.tsp">ts225.tsp</a>
##     - 
##    225-city problem (Juenger,Raecke,Tschoecke)
##     <p/>
##    </li>
##    <li>
##     <a href="tsp225.tsp">tsp225.tsp</a>
##     -
##    A TSP problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href=".tsp&quot;">225.opt.tour&quot;&gt;</a>
##     - 
##    Optimal solution for
##     <i>tsp225.tsp</i>
##     <p/>
##    </li>
##    <li>
##     <a href="u1060.tsp">u1060.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u1432.tsp">u1432.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u159.tsp">u159.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u1817.tsp">u1817.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u2152.tsp">u2152.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u2319.tsp">u2319.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u574.tsp">u574.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="u724.tsp">u724.tsp</a>
##     - 
##    Drilling problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="ulysses16.tsp">ulysses16.tsp</a>
##     - 
##    Odyssey of Ulysses (Groetschel and Padberg)
##     <p/>
##    </li>
##    <li>
##     <a href="ulysses16.opt.tour">ulysses16.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>ulysses16</i>
##     <p/>
##    </li>
##    <li>
##     <a href="ulysses22.tsp">ulysses22.tsp</a>
##     - 
##    Odyssey of Ulysses (Groetschel and Padberg)
##     <p/>
##    </li>
##    <li>
##     <a href="ulysses22.opt.tour">ulysses22.opt.tour</a>
##     - 
##    Optimum tour for
##     <i>ulysses22</i>
##     <p/>
##    </li>
##    <li>
##     <a href="usa13509.tsp">usa13509.tsp</a>
##     - 
##    Cities with population at least 500 in the continental US (David
##    Applegate and Andre Rohe)
##     <p/>
##    </li>
##    <li>
##     <a href="vm1084.tsp">vm1084.tsp</a>
##     - 
##    1084-city problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="vm1748.tsp">vm1748.tsp</a>
##     - 
##    1784-city problem (Reinelt)
##     <p/>
##    </li>
##    <li>
##     <a href="xray.problems">xray.problems</a>
##     -
##    Xray crystallography TSP (F. Nielsen, D. Shallcross) 
##    (source code for generator)
##    </li>
##   </ul>
##   <hr/>
##   <table align="LEFT" cellspacing="0" width="100%">
##    <caption/>
##    <tr align="LEFT">
##     <td align="LEFT">
##      <em>Last update: June 1, 1995</em>
##     </td>
##     <td align="LEFT">
##      <a target="_top" href="http://www.zib.de/adm/uaf.srv?-e+skorobohatyj">
##       <em>Georg Skorobohatyj</em>
##      </a>
##     </td>
##     <td align="RIGHT">
##      <a target="_top" href="http://www.zib.de/">
##       <em>ZIB Homepage</em>
##      </a>
##     </td>
##    </tr>
##   </table>
##   <br/>
##   <br/>
##   <font size="-1">URL: ftp://ftp.zib.de/pub/Packages/mp-testdata/tsplib/tsp/index.html</font>
##  </body>
## </html>

Drilling into the document

We can systematically explore the structure by obtaining a node’s children.

ds   = xmlChildren(root)    # document structure
ds
## $head
## <head>
##  <title>MP-TESTDATA - The TSPLIB Symmetric Traveling Salesman Problem Instances</title>
## </head>
## 
## $body
## <body>
##  <h4>
##   <center>MP-TESTDATA - The TSPLIB Symmetric Traveling Salesman Problem Instances</center>
##  </h4>
##  <hr/>
##  <ul>
##   <li>
##    <a href="a280.tsp">a280.tsp</a>
##    -
##    Drilling problem (Ludwig)
##    <p/>
##   </li>
##   <li>
##    <a href="a280.opt.tour">a280.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>a280</i>
##    <p/>
##   </li>
##   <li>
##    <a href="ali535.tsp">ali535.tsp</a>
##    -  
##    535 Airports around the globe (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="att48.tsp">att48.tsp</a>
##    - 
##    48 capitals of the US (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="att48.opt.tour">att48.opt.tour</a>
##    - 
##    Optimum solution for
##    <i>att48</i>
##    <p/>
##   </li>
##   <li>
##    <a href="att532.tsp">att532.tsp</a>
##    -  
##    532-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="bayg29.tsp">bayg29.tsp</a>
##    - 
##    29 Cities in Bavaria (geographical distance)
##    <p/>
##   </li>
##   <li>
##    <a href="bayg29.opt.tour">bayg29.opt.tour</a>
##    -   
##    Optimum solution of
##    <i>bayg29</i>
##    <p/>
##   </li>
##   <li>
##    <a href="bays29.tsp">bays29.tsp</a>
##    - 
##    29 Cities in Bavaria (street distance)
##    <p/>
##   </li>
##   <li>
##    <a href="bays29.opt.tour">bays29.opt.tour</a>
##    -
##    Optimum solution of
##    <i>bays29</i>
##    <p/>
##   </li>
##   <li>
##    <a href="berlin52.tsp">berlin52.tsp</a>
##    - 
##    52 locations in Berlin (Germany) (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="berlin52.opt.tour">berlin52.opt.tour</a>
##    -
##    Optimum tour for
##    <i>berlin52</i>
##    <p/>
##   </li>
##   <li>
##    <a href="bier127.tsp">bier127.tsp</a>
##    - 
##    127 beergardens in the Augsburg (Germany) area (Juenger/Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="brazil58.tsp">brazil58.tsp</a>
##    - 
##    58 cities in Brazil (C. Ferreira)
##    <p/>
##   </li>
##   <li>
##    <a href="brd14051.tsp">brd14051.tsp</a>
##    - 
##    Federal Republic of Germany with borders as of 1989 (Bachem/Wottawa)
##    <p/>
##   </li>
##   <li>
##    <a href="brg180.tsp">brg180.tsp</a>
##    - 
##    Bridge tournament problem (Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="brg180.opt.tour">brg180.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>brg180</i>
##    <p/>
##   </li>
##   <li>
##    <a href="burma14.tsp">burma14.tsp</a>
##    - 
##    14 cities in Burma (geographical coordinates)
##    <p/>
##   </li>
##   <li>
##    <a href="ch130.tsp">ch130.tsp</a>
##    - 
##    130 city problem (Churritz)
##    <p/>
##   </li>
##   <li>
##    <a href="ch130.opt.tour">ch130.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>ch130</i>
##    <p/>
##   </li>
##   <li>
##    <a href="ch150.tsp">ch150.tsp</a>
##    150 city problem (Churritz)
##    <p/>
##   </li>
##   <li>
##    <a href="ch150.opt.tour">ch150.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>ch150</i>
##    <p/>
##   </li>
##   <li>
##    <a href="d1291.tsp">d1291.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="d1655.tsp">d1655.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="d18512.tsp">d18512.tsp</a>
##    - 
##    Federal Republic of Germany (with ex-GDR territory) (Bachem/Wottawa)
##    <p/>
##   </li>
##   <li>
##    <a href="d198.tsp">d198.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="d2103.tsp">d2103.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="d493.tsp">d493.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="d657.tsp">d657.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="dantzig42.tsp">dantzig42.tsp</a>
##    - 
##    42 cities (Dantzig)
##    <p/>
##   </li>
##   <li>
##    <a href="dsj1000.tsp">dsj1000.tsp</a>
##    - 
##    Clustered random problem (Johnson)
##    <p/>
##   </li>
##   <li>
##    <a href="eil101.tsp">eil101.tsp</a>
##    -  
##    101-city problem (Christofides/Eilon)
##    <p/>
##   </li>
##   <li>
##    <a href="eil101.opt.tour">eil101.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>eil101</i>
##    <p/>
##   </li>
##   <li>
##    <a href="eil51.tsp">eil51.tsp</a>
##    - 
##    51-city problem (Christofides/Eilon)
##    <p/>
##   </li>
##   <li>
##    <a href="eil51.opt.tour">eil51.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>eil51</i>
##    <p/>
##   </li>
##   <li>
##    <a href="eil76.tsp">eil76.tsp</a>
##    - 
##    76-city problem (Christofides/Eilon)
##    <p/>
##   </li>
##   <li>
##    <a href="eil76.opt.tour">eil76.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>eil76</i>
##    <p/>
##   </li>
##   <li>
##    <a href="fl1400.tsp">fl1400.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="fl1577.tsp">fl1577.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="fl3795.tsp">fl3795.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="fl417.tsp">fl417.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="fnl4461.tsp">fnl4461.tsp</a>
##    - 
##    The five new Federal States of Germany (ex-GDR territory)
##    (Bachem/Wottawa)
##    <p/>
##   </li>
##   <li>
##    <a href="fri26.tsp">fri26.tsp</a>
##    - 
##    26-city problem (Fricker)
##    <p/>
##   </li>
##   <li>
##    <a href="fri26.opt.tour">fri26.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>fri26</i>
##    <p/>
##   </li>
##   <li>
##    <a href="gil262.tsp">gil262.ts</a>
##    - 
##    262-city problem (Gillet/Johnson)
##    <p/>
##   </li>
##   <li>
##    <a href="gr120.tsp">gr120.tsp</a>
##    - 
##    120 cities in Germany (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr120.opt.tour">gr120.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>gr120</i>
##    <p/>
##   </li>
##   <li>
##    <a href="gr137.tsp">gr137.tsp</a>
##    - 
##    America-Subproblem of 666-city TSP (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr17.tsp">gr17.tsp</a>
##    - 
##    17-city problem (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr202.tsp">gr202.tsp</a>
##    - 
##    Europe-Subproblem of 666-city TSP (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr202.opt.tour">gr202.opt.tour</a>
##    - 
##    Optimum solution for
##    <i>gr202</i>
##    <p/>
##   </li>
##   <li>
##    <a href="gr21.tsp">gr21.tsp</a>
##    - 
##    21-city problem (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr229.tsp">gr229.tsp</a>
##    - 
##    Asia/Australia-Subproblem of 666-city TSP (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr24.tsp">gr24.tsp</a>
##    - 
##    24-city problem (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr24.opt.tour">gr24.opt.tour</a>
##    - 
##    Optimum solution for
##    <i>gr24</i>
##    <p/>
##   </li>
##   <li>
##    <a href="gr431.tsp">gr431.tsp</a>
##    - 
##    Europe/Asia/Australia-Subproblem of 666-city TSP (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr48.tsp">gr48.tsp</a>
##    - 
##    48-city problem (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr48.opt.tour">gr48.opt.tour</a>
##    - 
##    Optimum solution for
##    <i>gr48</i>
##    <p/>
##   </li>
##   <li>
##    <a href="gr666.tsp">gr666.tsp</a>
##    - 
##    666 cities around the world (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr666.opt.tour">gr666.opt.tour</a>
##    - 
##    Optimum solution of
##    <i>gr666</i>
##    <p/>
##   </li>
##   <li>
##    <a href="gr96.tsp">gr96.tsp</a>
##    - 
##    Africa-Subproblem of 666-city TSP (Groetschel)
##    <p/>
##   </li>
##   <li>
##    <a href="gr96.opt.tour">gr96.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>gr96</i>
##    <p/>
##   </li>
##   <li>
##    <a href="hk48.tsp">hk48.tsp</a>
##    - 
##    48-city problem (Held/Karp)
##    <p/>
##   </li>
##   <li>
##    <a href="kroA100.tsp">kroa100.tsp</a>
##    - 
##    100-city problem A (Krolak/Felts/Nelson)
##    <p/>
##   </li>
##   <li>
##    <a href="kroA100.opt.tour">kroa100.opt.tour</a>
##    - 
##    Optimum for
##    <i>kroa100</i>
##    <p/>
##   </li>
##   <li>
##    <a href="kroA150.tsp">kroa150.tsp</a>
##    - 
##    150-city problem A (Krolak/Felts/Nelson)
##    <p/>
##   </li>
##   <li>
##    <a href="kroA200.tsp">kroa200.tsp</a>
##    - 
##    200-city problem A (Krolak/Felts/Nelson)
##    <p/>
##   </li>
##   <li>
##    <a href="kroB100.tsp">krob100.tsp</a>
##    - 
##    100-city problem B (Krolak/Felts/Nelson)
##    <p/>
##   </li>
##   <li>
##    <a href="kroB150.tsp">krob150.tsp</a>
##    - 
##    150-city problem B (Krolak/Felts/Nelson)
##    <p/>
##   </li>
##   <li>
##    <a href="kroB200.tsp">krob200.tsp</a>
##    - 
##    200-city problem B (Krolak/Felts/Nelson)
##    <p/>
##   </li>
##   <li>
##    <a href="kroC100.tsp">kroc100.tsp</a>
##    - 
##    100-city problem C (Krolak/Felts/Nelson)
##    <p/>
##   </li>
##   <li>
##    <a href="kroC100.opt.tour">kroc100.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>kroc100</i>
##    <p/>
##   </li>
##   <li>
##    <a href="kroD100.tsp">krod100.tsp</a>
##    - 
##    100-city problem D (Krolak/Felts/Nelson)
##    <p/>
##   </li>
##   <li>
##    <a href="kroD100.opt.tour">krod100.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>krod100</i>
##    <p/>
##   </li>
##   <li>
##    <a href="kroE100.tsp">kroe100.tsp</a>
##    - 
##    100-city problem E (Krolak/Felts/Nelson)
##    <p/>
##   </li>
##   <li>
##    <a href="lin105.tsp">lin105.tsp</a>
##    - 
##    105-city problem (Subproblem of lin318)
##    <p/>
##   </li>
##   <li>
##    <a href="lin105.opt.tour">lin105.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>lin105</i>
##    <p/>
##   </li>
##   <li>
##    <a href="lin318.tsp">lin318.tsp</a>
##    - 
##    318-city problem (Lin/Kernighan)
##    <p/>
##   </li>
##   <li>
##    <a href="linhp318.tsp">linhp318.tsp</a>
##    - 
##    Original 318-city problem (Lin/Kernighan)
##    <p/>
##   </li>
##   <li>
##    <a href="nrw1379.tsp">nrw1379.tsp</a>
##    - 
##    Nordrhein-Westfalen (Bachem/Wottawa)
##    <p/>
##   </li>
##   <li>
##    <a href="p654.tsp">p654.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="pa561.tsp">pa561.tsp</a>
##    - 
##    561-city problem (Kleinschmidt)
##    <p/>
##   </li>
##   <li>
##    <a href="pa561.opt.tour">pa561.opt.tour</a>
##    - 
##    Optimal.tour&quot;&gt;  -  for pa561
##    <p/>
##   </li>
##   <li>
##    <a href="pcb1173.tsp">pcb1173.tsp</a>
##    - 
##    Drilling problem (Juenger/Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="pcb3038.tsp">pcb3038.tsp</a>
##    - 
##    Drilling problem (Juenger/Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="pcb442.tsp">pcb442.tsp</a>
##    - 
##    Drilling problem (Groetschel/Juenger/Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="pcb442.opt.tour">pcb442.opt.tour</a>
##    - 
##    Optimum solution for
##    <i>pcb442</i>
##    <p/>
##   </li>
##   <li>
##    <a href="pla33810.tsp">pla33810.tsp</a>
##    - 
##    Programmed logic array (D.S.Johnson)
##    <p/>
##   </li>
##   <li>
##    <a href="pla7397.tsp">pla7397.tsp</a>
##    - 
##    Programmed logic array (D.S.Johnson)
##    <p/>
##   </li>
##   <li>
##    <a href="pla85900.tsp">pla85900.tsp</a>
##    - 
##    Programmed logic array (D.S.Johnson)
##    <p/>
##   </li>
##   <li>
##    <a href="pr1002.tsp">pr1002.tsp</a>
##    - 
##    1002-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr1002.opt.tour">pr1002.opt.tour</a>
##    - 
##    optimal.tour&quot;&gt;  -  for pr1002
##    <p/>
##   </li>
##   <li>
##    <a href="pr107.tsp">pr107.tsp</a>
##    - 
##    107-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr124.tsp">pr124.tsp</a>
##    - 
##    124-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr136.tsp">pr136.tsp</a>
##    - 
##    136-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr144.tsp">pr144.tsp</a>
##    - 
##    144-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr152.tsp">pr152.tsp</a>
##    - 
##    152-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr226.tsp">pr226.tsp</a>
##    - 
##    226-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr2392.tsp">pr2392.tsp</a>
##    - 
##    2392-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr2392.opt.tour">pr2392.opt.tour</a>
##    - 
##    Optimum solution for
##    <i>pr2392</i>
##    <p/>
##   </li>
##   <li>
##    <a href="pr264.tsp">pr264.tsp</a>
##    - 
##    264-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr299.tsp">pr299.tsp</a>
##    - 
##    299-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr439.tsp">pr439.tsp</a>
##    - 
##    439-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr76.tsp">pr76.tsp</a>
##    - 
##    76-city problem (Padberg/Rinaldi)
##    <p/>
##   </li>
##   <li>
##    <a href="pr76.opt.tour">pr76.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>pr76</i>
##    <p/>
##   </li>
##   <li>
##    <a href="rat195.tsp">rat195.tsp</a>
##    - 
##    Rattled grid (Pulleyblank)
##    <p/>
##   </li>
##   <li>
##    <a href="rat575.tsp">rat575.tsp</a>
##    - 
##    Rattled grid (Pulleyblank)
##    <p/>
##   </li>
##   <li>
##    <a href="rat783.tsp">rat783.tsp</a>
##    - 
##    Rattled grid (Pulleyblank)
##    <p/>
##   </li>
##   <li>
##    <a href="rat99.tsp">rat99.tsp</a>
##    - 
##    Rattled grid (Pulleyblank)
##    <p/>
##   </li>
##   <li>
##    <a href="rd100.tsp">rd100.tsp</a>
##    - 
##    100-city random TSP (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="rd100.opt.tour">rd100.opt.tour</a>
##    - 
##    Optimum solution for
##    <i>rd100</i>
##    <p/>
##   </li>
##   <li>
##    <a href="rd400.tsp">rd400.tsp</a>
##    - 
##    400-city random TSP (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="rl11849.tsp">rl11849.tsp</a>
##    - 
##    11849-city TSP (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="rl1304.tsp">rl1304.tsp</a>
##    - 
##    1304-city TSP (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="rl1323.tsp">rl1323.tsp</a>
##    - 
##    1323-city TSP (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="rl1889.tsp">rl1889.tsp</a>
##    - 
##    1889-city TSP (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="rl5915.tsp">rl5915.tsp</a>
##    - 
##    5915-city TSP (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="rl5934.tsp">rl5934.tsp</a>
##    - 
##    5934-city TSP (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="si1032.tsp">si1032.tsp</a>
##    - 
##    1032-vertex TSP (M. Hofmeister)
##    <p/>
##   </li>
##   <li>
##    <a href="si175.tsp">si175.tsp</a>
##    - 
##    175-vertex TSP (M. Hofmeister)
##    <p/>
##   </li>
##   <li>
##    <a href="si535.tsp">si535.tsp</a>
##    - 
##    535-vertex TSP (M. Hofmeister)
##    <p/>
##   </li>
##   <li>
##    <a href="st70.tsp">st70.tsp</a>
##    - 
##    70-city problem (Smith/Thompson)
##    <p/>
##   </li>
##   <li>
##    <a href="st70.opt.tour">st70.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>st70</i>
##    <p/>
##   </li>
##   <li>
##    <a href="swiss42.tsp">swiss42.tsp</a>
##    - 
##    42 cities Switzerland (Fricker)
##    <p/>
##   </li>
##   <li>
##    <a href="ts225.tsp">ts225.tsp</a>
##    - 
##    225-city problem (Juenger,Raecke,Tschoecke)
##    <p/>
##   </li>
##   <li>
##    <a href="tsp225.tsp">tsp225.tsp</a>
##    -
##    A TSP problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href=".tsp&quot;">225.opt.tour&quot;&gt;</a>
##    - 
##    Optimal solution for
##    <i>tsp225.tsp</i>
##    <p/>
##   </li>
##   <li>
##    <a href="u1060.tsp">u1060.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="u1432.tsp">u1432.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="u159.tsp">u159.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="u1817.tsp">u1817.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="u2152.tsp">u2152.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="u2319.tsp">u2319.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="u574.tsp">u574.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="u724.tsp">u724.tsp</a>
##    - 
##    Drilling problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="ulysses16.tsp">ulysses16.tsp</a>
##    - 
##    Odyssey of Ulysses (Groetschel and Padberg)
##    <p/>
##   </li>
##   <li>
##    <a href="ulysses16.opt.tour">ulysses16.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>ulysses16</i>
##    <p/>
##   </li>
##   <li>
##    <a href="ulysses22.tsp">ulysses22.tsp</a>
##    - 
##    Odyssey of Ulysses (Groetschel and Padberg)
##    <p/>
##   </li>
##   <li>
##    <a href="ulysses22.opt.tour">ulysses22.opt.tour</a>
##    - 
##    Optimum tour for
##    <i>ulysses22</i>
##    <p/>
##   </li>
##   <li>
##    <a href="usa13509.tsp">usa13509.tsp</a>
##    - 
##    Cities with population at least 500 in the continental US (David
##    Applegate and Andre Rohe)
##    <p/>
##   </li>
##   <li>
##    <a href="vm1084.tsp">vm1084.tsp</a>
##    - 
##    1084-city problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="vm1748.tsp">vm1748.tsp</a>
##    - 
##    1784-city problem (Reinelt)
##    <p/>
##   </li>
##   <li>
##    <a href="xray.problems">xray.problems</a>
##    -
##    Xray crystallography TSP (F. Nielsen, D. Shallcross) 
##    (source code for generator)
##   </li>
##  </ul>
##  <hr/>
##  <table align="LEFT" cellspacing="0" width="100%">
##   <caption/>
##   <tr align="LEFT">
##    <td align="LEFT">
##     <em>Last update: June 1, 1995</em>
##    </td>
##    <td align="LEFT">
##     <a target="_top" href="http://www.zib.de/adm/uaf.srv?-e+skorobohatyj">
##      <em>Georg Skorobohatyj</em>
##     </a>
##    </td>
##    <td align="RIGHT">
##     <a target="_top" href="http://www.zib.de/">
##      <em>ZIB Homepage</em>
##     </a>
##    </td>
##   </tr>
##  </table>
##  <br/>
##  <br/>
##  <font size="-1">URL: ftp://ftp.zib.de/pub/Packages/mp-testdata/tsplib/tsp/index.html</font>
## </body>
## 
## attr(,"class")
## [1] "XMLNodeList"
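
Printing a whole node list, as above, quickly becomes unwieldy. As a minimal sketch of the same idea, the children can be summarised by their tag names instead of printed in full. The page below is a small made-up string (`html_text`, `demo_doc`, `demo_root` and `demo_ds` are hypothetical names, not part of the TSPLIB example), parsed with `asText = TRUE` so that no download is needed.

```r
library(XML)

# A tiny hypothetical page standing in for the downloaded document.
html_text = paste0(
  '<html><head><title>Demo</title></head>',
  '<body><h4>Demo</h4>',
  '<ul><li><a href="a.tsp">a.tsp</a> - first</li>',
  '<li><a href="b.tsp">b.tsp</a> - second</li></ul>',
  '</body></html>')

demo_doc  = htmlTreeParse(html_text, asText = TRUE)  # parse a string instead of a URL
demo_root = xmlRoot(demo_doc)                        # the <html> node

demo_ds = xmlChildren(demo_root)                     # children of <html>
names(demo_ds)                                       # tag names only, not the full tree

body_tags = sapply(xmlChildren(demo_ds$body), xmlName)  # tag names one level down
body_tags

n_items = xmlSize(xmlChildren(demo_ds$body)$ul)      # number of children of <ul>
n_items
```

Looking at names and sizes first makes it easier to decide which child to descend into before printing any node in full.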

A typical HTML document contains a head and a body. The body holds the data we would like to extract. In this example, we want the unordered list (the ul element).

body = xmlChildren(ds$body) # children of the body node
ul = xmlChildren(body$ul)   # children of the unordered list (the li items)
ul
## $li
## <li>
##  <a href="a280.tsp">a280.tsp</a>
##  -
##    Drilling problem (Ludwig)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="a280.opt.tour">a280.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>a280</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="ali535.tsp">ali535.tsp</a>
##  -  
##    535 Airports around the globe (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="att48.tsp">att48.tsp</a>
##  - 
##    48 capitals of the US (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="att48.opt.tour">att48.opt.tour</a>
##  - 
##    Optimum solution for
##  <i>att48</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="att532.tsp">att532.tsp</a>
##  -  
##    532-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="bayg29.tsp">bayg29.tsp</a>
##  - 
##    29 Cities in Bavaria (geographical distance)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="bayg29.opt.tour">bayg29.opt.tour</a>
##  -   
##    Optimum solution of
##  <i>bayg29</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="bays29.tsp">bays29.tsp</a>
##  - 
##    29 Cities in Bavaria (street distance)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="bays29.opt.tour">bays29.opt.tour</a>
##  -
##    Optimum solution of
##  <i>bays29</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="berlin52.tsp">berlin52.tsp</a>
##  - 
##    52 locations in Berlin (Germany) (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="berlin52.opt.tour">berlin52.opt.tour</a>
##  -
##    Optimum tour for
##  <i>berlin52</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="bier127.tsp">bier127.tsp</a>
##  - 
##    127 beergardens in the Augsburg (Germany) area (Juenger/Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="brazil58.tsp">brazil58.tsp</a>
##  - 
##    58 cities in Brazil (C. Ferreira)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="brd14051.tsp">brd14051.tsp</a>
##  - 
##    Federal Republic of Germany with borders as of 1989 (Bachem/Wottawa)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="brg180.tsp">brg180.tsp</a>
##  - 
##    Bridge tournament problem (Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="brg180.opt.tour">brg180.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>brg180</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="burma14.tsp">burma14.tsp</a>
##  - 
##    14 cities in Burma (geographical coordinates)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="ch130.tsp">ch130.tsp</a>
##  - 
##    130 city problem (Churritz)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="ch130.opt.tour">ch130.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>ch130</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="ch150.tsp">ch150.tsp</a>
##  150 city problem (Churritz)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="ch150.opt.tour">ch150.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>ch150</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="d1291.tsp">d1291.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="d1655.tsp">d1655.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="d18512.tsp">d18512.tsp</a>
##  - 
##    Federal Republic of Germany (with ex-GDR territory) (Bachem/Wottawa)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="d198.tsp">d198.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="d2103.tsp">d2103.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="d493.tsp">d493.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="d657.tsp">d657.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="dantzig42.tsp">dantzig42.tsp</a>
##  - 
##    42 cities (Dantzig)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="dsj1000.tsp">dsj1000.tsp</a>
##  - 
##    Clustered random problem (Johnson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="eil101.tsp">eil101.tsp</a>
##  -  
##    101-city problem (Christofides/Eilon)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="eil101.opt.tour">eil101.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>eil101</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="eil51.tsp">eil51.tsp</a>
##  - 
##    51-city problem (Christofides/Eilon)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="eil51.opt.tour">eil51.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>eil51</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="eil76.tsp">eil76.tsp</a>
##  - 
##    76-city problem (Christofides/Eilon)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="eil76.opt.tour">eil76.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>eil76</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="fl1400.tsp">fl1400.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="fl1577.tsp">fl1577.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="fl3795.tsp">fl3795.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="fl417.tsp">fl417.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="fnl4461.tsp">fnl4461.tsp</a>
##  - 
##    The five new Federal States of Germany (ex-GDR territory)
##    (Bachem/Wottawa)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="fri26.tsp">fri26.tsp</a>
##  - 
##    26-city problem (Fricker)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="fri26.opt.tour">fri26.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>fri26</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gil262.tsp">gil262.ts</a>
##  - 
##    262-city problem (Gillet/Johnson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr120.tsp">gr120.tsp</a>
##  - 
##    120 cities in Germany (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr120.opt.tour">gr120.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>gr120</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr137.tsp">gr137.tsp</a>
##  - 
##    America-Subproblem of 666-city TSP (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr17.tsp">gr17.tsp</a>
##  - 
##    17-city problem (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr202.tsp">gr202.tsp</a>
##  - 
##    Europe-Subproblem of 666-city TSP (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr202.opt.tour">gr202.opt.tour</a>
##  - 
##    Optimum solution for
##  <i>gr202</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr21.tsp">gr21.tsp</a>
##  - 
##    21-city problem (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr229.tsp">gr229.tsp</a>
##  - 
##    Asia/Australia-Subproblem of 666-city TSP (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr24.tsp">gr24.tsp</a>
##  - 
##    24-city problem (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr24.opt.tour">gr24.opt.tour</a>
##  - 
##    Optimum solution for
##  <i>gr24</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr431.tsp">gr431.tsp</a>
##  - 
##    Europe/Asia/Australia-Subproblem of 666-city TSP (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr48.tsp">gr48.tsp</a>
##  - 
##    48-city problem (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr48.opt.tour">gr48.opt.tour</a>
##  - 
##    Optimum solution for
##  <i>gr48</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr666.tsp">gr666.tsp</a>
##  - 
##    666 cities around the world (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr666.opt.tour">gr666.opt.tour</a>
##  - 
##    Optimum solution of
##  <i>gr666</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr96.tsp">gr96.tsp</a>
##  - 
##    Africa-Subproblem of 666-city TSP (Groetschel)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="gr96.opt.tour">gr96.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>gr96</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="hk48.tsp">hk48.tsp</a>
##  - 
##    48-city problem (Held/Karp)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroA100.tsp">kroa100.tsp</a>
##  - 
##    100-city problem A (Krolak/Felts/Nelson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroA100.opt.tour">kroa100.opt.tour</a>
##  - 
##    Optimum for
##  <i>kroa100</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroA150.tsp">kroa150.tsp</a>
##  - 
##    150-city problem A (Krolak/Felts/Nelson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroA200.tsp">kroa200.tsp</a>
##  - 
##    200-city problem A (Krolak/Felts/Nelson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroB100.tsp">krob100.tsp</a>
##  - 
##    100-city problem B (Krolak/Felts/Nelson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroB150.tsp">krob150.tsp</a>
##  - 
##    150-city problem B (Krolak/Felts/Nelson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroB200.tsp">krob200.tsp</a>
##  - 
##    200-city problem B (Krolak/Felts/Nelson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroC100.tsp">kroc100.tsp</a>
##  - 
##    100-city problem C (Krolak/Felts/Nelson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroC100.opt.tour">kroc100.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>kroc100</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroD100.tsp">krod100.tsp</a>
##  - 
##    100-city problem D (Krolak/Felts/Nelson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroD100.opt.tour">krod100.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>krod100</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="kroE100.tsp">kroe100.tsp</a>
##  - 
##    100-city problem E (Krolak/Felts/Nelson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="lin105.tsp">lin105.tsp</a>
##  - 
##    105-city problem (Subproblem of lin318)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="lin105.opt.tour">lin105.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>lin105</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="lin318.tsp">lin318.tsp</a>
##  - 
##    318-city problem (Lin/Kernighan)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="linhp318.tsp">linhp318.tsp</a>
##  - 
##    Original 318-city problem (Lin/Kernighan)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="nrw1379.tsp">nrw1379.tsp</a>
##  - 
##    Nordrhein-Westfalen (Bachem/Wottawa)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="p654.tsp">p654.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pa561.tsp">pa561.tsp</a>
##  - 
##    561-city problem (Kleinschmidt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pa561.opt.tour">pa561.opt.tour</a>
##  - 
##    Optimal.tour&quot;&gt;  -  for pa561
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pcb1173.tsp">pcb1173.tsp</a>
##  - 
##    Drilling problem (Juenger/Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pcb3038.tsp">pcb3038.tsp</a>
##  - 
##    Drilling problem (Juenger/Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pcb442.tsp">pcb442.tsp</a>
##  - 
##    Drilling problem (Groetschel/Juenger/Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pcb442.opt.tour">pcb442.opt.tour</a>
##  - 
##    Optimum solution for
##  <i>pcb442</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pla33810.tsp">pla33810.tsp</a>
##  - 
##    Programmed logic array (D.S.Johnson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pla7397.tsp">pla7397.tsp</a>
##  - 
##    Programmed logic array (D.S.Johnson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pla85900.tsp">pla85900.tsp</a>
##  - 
##    Programmed logic array (D.S.Johnson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr1002.tsp">pr1002.tsp</a>
##  - 
##    1002-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr1002.opt.tour">pr1002.opt.tour</a>
##  - 
##    optimal.tour&quot;&gt;  -  for pr1002
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr107.tsp">pr107.tsp</a>
##  - 
##    107-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr124.tsp">pr124.tsp</a>
##  - 
##    124-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr136.tsp">pr136.tsp</a>
##  - 
##    136-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr144.tsp">pr144.tsp</a>
##  - 
##    144-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr152.tsp">pr152.tsp</a>
##  - 
##    152-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr226.tsp">pr226.tsp</a>
##  - 
##    226-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr2392.tsp">pr2392.tsp</a>
##  - 
##    2392-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr2392.opt.tour">pr2392.opt.tour</a>
##  - 
##    Optimum solution for
##  <i>pr2392</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr264.tsp">pr264.tsp</a>
##  - 
##    264-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr299.tsp">pr299.tsp</a>
##  - 
##    299-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr439.tsp">pr439.tsp</a>
##  - 
##    439-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr76.tsp">pr76.tsp</a>
##  - 
##    76-city problem (Padberg/Rinaldi)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="pr76.opt.tour">pr76.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>pr76</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rat195.tsp">rat195.tsp</a>
##  - 
##    Rattled grid (Pulleyblank)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rat575.tsp">rat575.tsp</a>
##  - 
##    Rattled grid (Pulleyblank)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rat783.tsp">rat783.tsp</a>
##  - 
##    Rattled grid (Pulleyblank)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rat99.tsp">rat99.tsp</a>
##  - 
##    Rattled grid (Pulleyblank)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rd100.tsp">rd100.tsp</a>
##  - 
##    100-city random TSP (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rd100.opt.tour">rd100.opt.tour</a>
##  - 
##    Optimum solution for
##  <i>rd100</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rd400.tsp">rd400.tsp</a>
##  - 
##    400-city random TSP (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rl11849.tsp">rl11849.tsp</a>
##  - 
##    11849-city TSP (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rl1304.tsp">rl1304.tsp</a>
##  - 
##    1304-city TSP (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rl1323.tsp">rl1323.tsp</a>
##  - 
##    1323-city TSP (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rl1889.tsp">rl1889.tsp</a>
##  - 
##    1889-city TSP (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rl5915.tsp">rl5915.tsp</a>
##  - 
##    5915-city TSP (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="rl5934.tsp">rl5934.tsp</a>
##  - 
##    5934-city TSP (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="si1032.tsp">si1032.tsp</a>
##  - 
##    1032-vertex TSP (M. Hofmeister)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="si175.tsp">si175.tsp</a>
##  - 
##    175-vertex TSP (M. Hofmeister)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="si535.tsp">si535.tsp</a>
##  - 
##    535-vertex TSP (M. Hofmeister)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="st70.tsp">st70.tsp</a>
##  - 
##    70-city problem (Smith/Thompson)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="st70.opt.tour">st70.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>st70</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="swiss42.tsp">swiss42.tsp</a>
##  - 
##    42 cities Switzerland (Fricker)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="ts225.tsp">ts225.tsp</a>
##  - 
##    225-city problem (Juenger,Raecke,Tschoecke)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="tsp225.tsp">tsp225.tsp</a>
##  -
##    A TSP problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href=".tsp&quot;">225.opt.tour&quot;&gt;</a>
##  - 
##    Optimal solution for
##  <i>tsp225.tsp</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="u1060.tsp">u1060.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="u1432.tsp">u1432.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="u159.tsp">u159.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="u1817.tsp">u1817.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="u2152.tsp">u2152.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="u2319.tsp">u2319.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="u574.tsp">u574.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="u724.tsp">u724.tsp</a>
##  - 
##    Drilling problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="ulysses16.tsp">ulysses16.tsp</a>
##  - 
##    Odyssey of Ulysses (Groetschel and Padberg)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="ulysses16.opt.tour">ulysses16.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>ulysses16</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="ulysses22.tsp">ulysses22.tsp</a>
##  - 
##    Odyssey of Ulysses (Groetschel and Padberg)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="ulysses22.opt.tour">ulysses22.opt.tour</a>
##  - 
##    Optimum tour for
##  <i>ulysses22</i>
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="usa13509.tsp">usa13509.tsp</a>
##  - 
##    Cities with population at least 500 in the continental US (David
##    Applegate and Andre Rohe)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="vm1084.tsp">vm1084.tsp</a>
##  - 
##    1084-city problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="vm1748.tsp">vm1748.tsp</a>
##  - 
##    1784-city problem (Reinelt)
##  <p/>
## </li>
## 
## $li
## <li>
##  <a href="xray.problems">xray.problems</a>
##  -
##    Xray crystallography TSP (F. Nielsen, D. Shallcross) 
##    (source code for generator)
## </li>
## 
## attr(,"class")
## [1] "XMLNodeList"

Let us investigate one element in more detail.

e = ul[[1]]    #extract first element
ce <- e %>% xmlChildren() # look at its children
#ce$a %>% xmlChildren()    # get the anchor's children
href = ce$a %>% xmlAttrs() # this has a hyper-reference attribute
txt = str_trim(ce$text %>% xmlValue()) # get the description
txt = str_trim(str_remove(txt,'-\n'))  # and clean it
cat("file url:",href,"\ndescription:",txt)
## file url: a280.tsp 
## description: Drilling problem (Ludwig)
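
The steps above can be wrapped into a small function so that they can be reused for any list item. The following is only a sketch: the helper name parse_item and the inline snippet are assumptions made for illustration, and it uses base R (sub, trimws) instead of stringr. It parses a one-item copy of the list from a string, so it runs without network access.

```r
library(XML)

# Hypothetical helper (not part of the XML package):
# returns the link target and the cleaned description of one <li> node.
parse_item <- function(li) {
  a    <- getNodeSet(li, "./a")[[1]]          # the anchor inside this item
  href <- xmlGetAttr(a, "href")               # value of the href attribute
  # collect the text nodes that are direct children of <li>
  txt  <- paste(xpathSApply(li, "./text()", xmlValue), collapse = " ")
  txt  <- sub("^\\s*-\\s*", "", txt)          # drop the leading "-" separator
  c(href = href, description = trimws(txt))
}

# Inline stand-in for the first list item of the page
snippet <- '<ul><li><a href="a280.tsp">a280.tsp</a> -
   Drilling problem (Ludwig)</li></ul>'
doc  <- htmlParse(snippet, asText = TRUE)
item <- parse_item(getNodeSet(doc, "//li")[[1]])
item
```

Note that this sketch only reads the text directly inside the li tag, so for items such as "Optimum tour for <i>a280</i>" the part inside the i tag would be left out.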

Gathering information quickly

Once the structure of the document is understood, information can be extracted quickly. For instance, suppose we want the unordered list identified by the html-tag <ul> in the document. The XPath expression //ul selects every ul element, no matter where it sits in the document tree.

ul2  = getNodeSet(root, "//ul") # go there directly
ul2
## [[1]]
## <ul>
##  <li>
##   <a href="a280.tsp">a280.tsp</a>
##   -
##    Drilling problem (Ludwig)
##   <p/>
##  </li>
##  <li>
##   <a href="a280.opt.tour">a280.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>a280</i>
##   <p/>
##  </li>
##  <li>
##   <a href="ali535.tsp">ali535.tsp</a>
##   -  
##    535 Airports around the globe (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="att48.tsp">att48.tsp</a>
##   - 
##    48 capitals of the US (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="att48.opt.tour">att48.opt.tour</a>
##   - 
##    Optimum solution for
##   <i>att48</i>
##   <p/>
##  </li>
##  <li>
##   <a href="att532.tsp">att532.tsp</a>
##   -  
##    532-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="bayg29.tsp">bayg29.tsp</a>
##   - 
##    29 Cities in Bavaria (geographical distance)
##   <p/>
##  </li>
##  <li>
##   <a href="bayg29.opt.tour">bayg29.opt.tour</a>
##   -   
##    Optimum solution of
##   <i>bayg29</i>
##   <p/>
##  </li>
##  <li>
##   <a href="bays29.tsp">bays29.tsp</a>
##   - 
##    29 Cities in Bavaria (street distance)
##   <p/>
##  </li>
##  <li>
##   <a href="bays29.opt.tour">bays29.opt.tour</a>
##   -
##    Optimum solution of
##   <i>bays29</i>
##   <p/>
##  </li>
##  <li>
##   <a href="berlin52.tsp">berlin52.tsp</a>
##   - 
##    52 locations in Berlin (Germany) (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="berlin52.opt.tour">berlin52.opt.tour</a>
##   -
##    Optimum tour for
##   <i>berlin52</i>
##   <p/>
##  </li>
##  <li>
##   <a href="bier127.tsp">bier127.tsp</a>
##   - 
##    127 beergardens in the Augsburg (Germany) area (Juenger/Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="brazil58.tsp">brazil58.tsp</a>
##   - 
##    58 cities in Brazil (C. Ferreira)
##   <p/>
##  </li>
##  <li>
##   <a href="brd14051.tsp">brd14051.tsp</a>
##   - 
##    Federal Republic of Germany with borders as of 1989 (Bachem/Wottawa)
##   <p/>
##  </li>
##  <li>
##   <a href="brg180.tsp">brg180.tsp</a>
##   - 
##    Bridge tournament problem (Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="brg180.opt.tour">brg180.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>brg180</i>
##   <p/>
##  </li>
##  <li>
##   <a href="burma14.tsp">burma14.tsp</a>
##   - 
##    14 cities in Burma (geographical coordinates)
##   <p/>
##  </li>
##  <li>
##   <a href="ch130.tsp">ch130.tsp</a>
##   - 
##    130 city problem (Churritz)
##   <p/>
##  </li>
##  <li>
##   <a href="ch130.opt.tour">ch130.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>ch130</i>
##   <p/>
##  </li>
##  <li>
##   <a href="ch150.tsp">ch150.tsp</a>
##   150 city problem (Churritz)
##   <p/>
##  </li>
##  <li>
##   <a href="ch150.opt.tour">ch150.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>ch150</i>
##   <p/>
##  </li>
##  <li>
##   <a href="d1291.tsp">d1291.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="d1655.tsp">d1655.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="d18512.tsp">d18512.tsp</a>
##   - 
##    Federal Republic of Germany (with ex-GDR territory) (Bachem/Wottawa)
##   <p/>
##  </li>
##  <li>
##   <a href="d198.tsp">d198.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="d2103.tsp">d2103.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="d493.tsp">d493.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="d657.tsp">d657.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="dantzig42.tsp">dantzig42.tsp</a>
##   - 
##    42 cities (Dantzig)
##   <p/>
##  </li>
##  <li>
##   <a href="dsj1000.tsp">dsj1000.tsp</a>
##   - 
##    Clustered random problem (Johnson)
##   <p/>
##  </li>
##  <li>
##   <a href="eil101.tsp">eil101.tsp</a>
##   -  
##    101-city problem (Christofides/Eilon)
##   <p/>
##  </li>
##  <li>
##   <a href="eil101.opt.tour">eil101.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>eil101</i>
##   <p/>
##  </li>
##  <li>
##   <a href="eil51.tsp">eil51.tsp</a>
##   - 
##    51-city problem (Christofides/Eilon)
##   <p/>
##  </li>
##  <li>
##   <a href="eil51.opt.tour">eil51.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>eil51</i>
##   <p/>
##  </li>
##  <li>
##   <a href="eil76.tsp">eil76.tsp</a>
##   - 
##    76-city problem (Christofides/Eilon)
##   <p/>
##  </li>
##  <li>
##   <a href="eil76.opt.tour">eil76.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>eil76</i>
##   <p/>
##  </li>
##  <li>
##   <a href="fl1400.tsp">fl1400.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="fl1577.tsp">fl1577.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="fl3795.tsp">fl3795.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="fl417.tsp">fl417.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="fnl4461.tsp">fnl4461.tsp</a>
##   - 
##    The five new Federal States of Germany (ex-GDR territory)
##    (Bachem/Wottawa)
##   <p/>
##  </li>
##  <li>
##   <a href="fri26.tsp">fri26.tsp</a>
##   - 
##    26-city problem (Fricker)
##   <p/>
##  </li>
##  <li>
##   <a href="fri26.opt.tour">fri26.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>fri26</i>
##   <p/>
##  </li>
##  <li>
##   <a href="gil262.tsp">gil262.ts</a>
##   - 
##    262-city problem (Gillet/Johnson)
##   <p/>
##  </li>
##  <li>
##   <a href="gr120.tsp">gr120.tsp</a>
##   - 
##    120 cities in Germany (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr120.opt.tour">gr120.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>gr120</i>
##   <p/>
##  </li>
##  <li>
##   <a href="gr137.tsp">gr137.tsp</a>
##   - 
##    America-Subproblem of 666-city TSP (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr17.tsp">gr17.tsp</a>
##   - 
##    17-city problem (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr202.tsp">gr202.tsp</a>
##   - 
##    Europe-Subproblem of 666-city TSP (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr202.opt.tour">gr202.opt.tour</a>
##   - 
##    Optimum solution for
##   <i>gr202</i>
##   <p/>
##  </li>
##  <li>
##   <a href="gr21.tsp">gr21.tsp</a>
##   - 
##    21-city problem (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr229.tsp">gr229.tsp</a>
##   - 
##    Asia/Australia-Subproblem of 666-city TSP (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr24.tsp">gr24.tsp</a>
##   - 
##    24-city problem (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr24.opt.tour">gr24.opt.tour</a>
##   - 
##    Optimum solution for
##   <i>gr24</i>
##   <p/>
##  </li>
##  <li>
##   <a href="gr431.tsp">gr431.tsp</a>
##   - 
##    Europe/Asia/Australia-Subproblem of 666-city TSP (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr48.tsp">gr48.tsp</a>
##   - 
##    48-city problem (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr48.opt.tour">gr48.opt.tour</a>
##   - 
##    Optimum solution for
##   <i>gr48</i>
##   <p/>
##  </li>
##  <li>
##   <a href="gr666.tsp">gr666.tsp</a>
##   - 
##    666 cities around the world (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr666.opt.tour">gr666.opt.tour</a>
##   - 
##    Optimum solution of
##   <i>gr666</i>
##   <p/>
##  </li>
##  <li>
##   <a href="gr96.tsp">gr96.tsp</a>
##   - 
##    Africa-Subproblem of 666-city TSP (Groetschel)
##   <p/>
##  </li>
##  <li>
##   <a href="gr96.opt.tour">gr96.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>gr96</i>
##   <p/>
##  </li>
##  <li>
##   <a href="hk48.tsp">hk48.tsp</a>
##   - 
##    48-city problem (Held/Karp)
##   <p/>
##  </li>
##  <li>
##   <a href="kroA100.tsp">kroa100.tsp</a>
##   - 
##    100-city problem A (Krolak/Felts/Nelson)
##   <p/>
##  </li>
##  <li>
##   <a href="kroA100.opt.tour">kroa100.opt.tour</a>
##   - 
##    Optimum for
##   <i>kroa100</i>
##   <p/>
##  </li>
##  <li>
##   <a href="kroA150.tsp">kroa150.tsp</a>
##   - 
##    150-city problem A (Krolak/Felts/Nelson)
##   <p/>
##  </li>
##  <li>
##   <a href="kroA200.tsp">kroa200.tsp</a>
##   - 
##    200-city problem A (Krolak/Felts/Nelson)
##   <p/>
##  </li>
##  <li>
##   <a href="kroB100.tsp">krob100.tsp</a>
##   - 
##    100-city problem B (Krolak/Felts/Nelson)
##   <p/>
##  </li>
##  <li>
##   <a href="kroB150.tsp">krob150.tsp</a>
##   - 
##    150-city problem B (Krolak/Felts/Nelson)
##   <p/>
##  </li>
##  <li>
##   <a href="kroB200.tsp">krob200.tsp</a>
##   - 
##    200-city problem B (Krolak/Felts/Nelson)
##   <p/>
##  </li>
##  <li>
##   <a href="kroC100.tsp">kroc100.tsp</a>
##   - 
##    100-city problem C (Krolak/Felts/Nelson)
##   <p/>
##  </li>
##  <li>
##   <a href="kroC100.opt.tour">kroc100.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>kroc100</i>
##   <p/>
##  </li>
##  <li>
##   <a href="kroD100.tsp">krod100.tsp</a>
##   - 
##    100-city problem D (Krolak/Felts/Nelson)
##   <p/>
##  </li>
##  <li>
##   <a href="kroD100.opt.tour">krod100.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>krod100</i>
##   <p/>
##  </li>
##  <li>
##   <a href="kroE100.tsp">kroe100.tsp</a>
##   - 
##    100-city problem E (Krolak/Felts/Nelson)
##   <p/>
##  </li>
##  <li>
##   <a href="lin105.tsp">lin105.tsp</a>
##   - 
##    105-city problem (Subproblem of lin318)
##   <p/>
##  </li>
##  <li>
##   <a href="lin105.opt.tour">lin105.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>lin105</i>
##   <p/>
##  </li>
##  <li>
##   <a href="lin318.tsp">lin318.tsp</a>
##   - 
##    318-city problem (Lin/Kernighan)
##   <p/>
##  </li>
##  <li>
##   <a href="linhp318.tsp">linhp318.tsp</a>
##   - 
##    Original 318-city problem (Lin/Kernighan)
##   <p/>
##  </li>
##  <li>
##   <a href="nrw1379.tsp">nrw1379.tsp</a>
##   - 
##    Nordrhein-Westfalen (Bachem/Wottawa)
##   <p/>
##  </li>
##  <li>
##   <a href="p654.tsp">p654.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="pa561.tsp">pa561.tsp</a>
##   - 
##    561-city problem (Kleinschmidt)
##   <p/>
##  </li>
##  <li>
##   <a href="pa561.opt.tour">pa561.opt.tour</a>
##   - 
##    Optimal.tour&quot;&gt;  -  for pa561
##   <p/>
##  </li>
##  <li>
##   <a href="pcb1173.tsp">pcb1173.tsp</a>
##   - 
##    Drilling problem (Juenger/Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="pcb3038.tsp">pcb3038.tsp</a>
##   - 
##    Drilling problem (Juenger/Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="pcb442.tsp">pcb442.tsp</a>
##   - 
##    Drilling problem (Groetschel/Juenger/Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="pcb442.opt.tour">pcb442.opt.tour</a>
##   - 
##    Optimum solution for
##   <i>pcb442</i>
##   <p/>
##  </li>
##  <li>
##   <a href="pla33810.tsp">pla33810.tsp</a>
##   - 
##    Programmed logic array (D.S.Johnson)
##   <p/>
##  </li>
##  <li>
##   <a href="pla7397.tsp">pla7397.tsp</a>
##   - 
##    Programmed logic array (D.S.Johnson)
##   <p/>
##  </li>
##  <li>
##   <a href="pla85900.tsp">pla85900.tsp</a>
##   - 
##    Programmed logic array (D.S.Johnson)
##   <p/>
##  </li>
##  <li>
##   <a href="pr1002.tsp">pr1002.tsp</a>
##   - 
##    1002-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr1002.opt.tour">pr1002.opt.tour</a>
##   - 
##    optimal.tour&quot;&gt;  -  for pr1002
##   <p/>
##  </li>
##  <li>
##   <a href="pr107.tsp">pr107.tsp</a>
##   - 
##    107-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr124.tsp">pr124.tsp</a>
##   - 
##    124-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr136.tsp">pr136.tsp</a>
##   - 
##    136-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr144.tsp">pr144.tsp</a>
##   - 
##    144-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr152.tsp">pr152.tsp</a>
##   - 
##    152-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr226.tsp">pr226.tsp</a>
##   - 
##    226-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr2392.tsp">pr2392.tsp</a>
##   - 
##    2392-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr2392.opt.tour">pr2392.opt.tour</a>
##   - 
##    Optimum solution for
##   <i>pr2392</i>
##   <p/>
##  </li>
##  <li>
##   <a href="pr264.tsp">pr264.tsp</a>
##   - 
##    264-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr299.tsp">pr299.tsp</a>
##   - 
##    299-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr439.tsp">pr439.tsp</a>
##   - 
##    439-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr76.tsp">pr76.tsp</a>
##   - 
##    76-city problem (Padberg/Rinaldi)
##   <p/>
##  </li>
##  <li>
##   <a href="pr76.opt.tour">pr76.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>pr76</i>
##   <p/>
##  </li>
##  <li>
##   <a href="rat195.tsp">rat195.tsp</a>
##   - 
##    Rattled grid (Pulleyblank)
##   <p/>
##  </li>
##  <li>
##   <a href="rat575.tsp">rat575.tsp</a>
##   - 
##    Rattled grid (Pulleyblank)
##   <p/>
##  </li>
##  <li>
##   <a href="rat783.tsp">rat783.tsp</a>
##   - 
##    Rattled grid (Pulleyblank)
##   <p/>
##  </li>
##  <li>
##   <a href="rat99.tsp">rat99.tsp</a>
##   - 
##    Rattled grid (Pulleyblank)
##   <p/>
##  </li>
##  <li>
##   <a href="rd100.tsp">rd100.tsp</a>
##   - 
##    100-city random TSP (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="rd100.opt.tour">rd100.opt.tour</a>
##   - 
##    Optimum solution for
##   <i>rd100</i>
##   <p/>
##  </li>
##  <li>
##   <a href="rd400.tsp">rd400.tsp</a>
##   - 
##    400-city random TSP (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="rl11849.tsp">rl11849.tsp</a>
##   - 
##    11849-city TSP (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="rl1304.tsp">rl1304.tsp</a>
##   - 
##    1304-city TSP (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="rl1323.tsp">rl1323.tsp</a>
##   - 
##    1323-city TSP (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="rl1889.tsp">rl1889.tsp</a>
##   - 
##    1889-city TSP (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="rl5915.tsp">rl5915.tsp</a>
##   - 
##    5915-city TSP (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="rl5934.tsp">rl5934.tsp</a>
##   - 
##    5934-city TSP (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="si1032.tsp">si1032.tsp</a>
##   - 
##    1032-vertex TSP (M. Hofmeister)
##   <p/>
##  </li>
##  <li>
##   <a href="si175.tsp">si175.tsp</a>
##   - 
##    175-vertex TSP (M. Hofmeister)
##   <p/>
##  </li>
##  <li>
##   <a href="si535.tsp">si535.tsp</a>
##   - 
##    535-vertex TSP (M. Hofmeister)
##   <p/>
##  </li>
##  <li>
##   <a href="st70.tsp">st70.tsp</a>
##   - 
##    70-city problem (Smith/Thompson)
##   <p/>
##  </li>
##  <li>
##   <a href="st70.opt.tour">st70.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>st70</i>
##   <p/>
##  </li>
##  <li>
##   <a href="swiss42.tsp">swiss42.tsp</a>
##   - 
##    42 cities Switzerland (Fricker)
##   <p/>
##  </li>
##  <li>
##   <a href="ts225.tsp">ts225.tsp</a>
##   - 
##    225-city problem (Juenger,Raecke,Tschoecke)
##   <p/>
##  </li>
##  <li>
##   <a href="tsp225.tsp">tsp225.tsp</a>
##   -
##    A TSP problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href=".tsp&quot;">225.opt.tour&quot;&gt;</a>
##   - 
##    Optimal solution for
##   <i>tsp225.tsp</i>
##   <p/>
##  </li>
##  <li>
##   <a href="u1060.tsp">u1060.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="u1432.tsp">u1432.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="u159.tsp">u159.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="u1817.tsp">u1817.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="u2152.tsp">u2152.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="u2319.tsp">u2319.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="u574.tsp">u574.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="u724.tsp">u724.tsp</a>
##   - 
##    Drilling problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="ulysses16.tsp">ulysses16.tsp</a>
##   - 
##    Odyssey of Ulysses (Groetschel and Padberg)
##   <p/>
##  </li>
##  <li>
##   <a href="ulysses16.opt.tour">ulysses16.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>ulysses16</i>
##   <p/>
##  </li>
##  <li>
##   <a href="ulysses22.tsp">ulysses22.tsp</a>
##   - 
##    Odyssey of Ulysses (Groetschel and Padberg)
##   <p/>
##  </li>
##  <li>
##   <a href="ulysses22.opt.tour">ulysses22.opt.tour</a>
##   - 
##    Optimum tour for
##   <i>ulysses22</i>
##   <p/>
##  </li>
##  <li>
##   <a href="usa13509.tsp">usa13509.tsp</a>
##   - 
##    Cities with population at least 500 in the continental US (David
##    Applegate and Andre Rohe)
##   <p/>
##  </li>
##  <li>
##   <a href="vm1084.tsp">vm1084.tsp</a>
##   - 
##    1084-city problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="vm1748.tsp">vm1748.tsp</a>
##   - 
##    1784-city problem (Reinelt)
##   <p/>
##  </li>
##  <li>
##   <a href="xray.problems">xray.problems</a>
##   -
##    Xray crystallography TSP (F. Nielsen, D. Shallcross) 
##    (source code for generator)
##  </li>
## </ul>

Now let us get all anchor elements <a> in the unordered list and examine the first one.

ns = getNodeSet(root, '//ul/li/a') # anchors
e = ns[[1]]
cat("Attributes:", e %>% xmlAttrs(),
    "Value:", e %>% xmlValue() )
## Attributes: a280.tsp Value: a280.tsp

The original html/xml code is:

<a href="a280.tsp"> a280.tsp </a>

The element has a single attribute named href with the value a280.tsp. The element's value, i.e. the text between the opening and closing tags, happens to be the same in this example.
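To make the distinction between attributes and value concrete, here is a minimal sketch on an inline snippet rather than the live page. The snippet, its extra title attribute and the names html, doc_demo and a are made up for illustration. xmlGetAttr() from the XML package looks up one attribute by name, which may be the safer choice over xmlAttrs() when an anchor carries more than one attribute.

```r
library(XML)

# Hypothetical stand-in for one list item of the downloaded page
html <- '<ul><li><a href="a280.tsp" title="demo">a280.tsp</a> - Drilling problem (Ludwig)</li></ul>'
doc_demo <- htmlParse(html, asText = TRUE)

# First anchor inside the list
a <- getNodeSet(doc_demo, "//ul/li/a")[[1]]

attrs <- xmlAttrs(a)            # named character vector: href and title
href  <- xmlGetAttr(a, "href")  # only the href attribute
label <- xmlValue(a)            # text between <a> and </a>

cat("href:", href, "label:", label, "\n")
```

With a single attribute, as on the TSPLIB page, both functions give the same string; with several attributes only the named lookup returns exactly one value per anchor.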

To obtain the URLs from the href attributes of all anchors, we write:

urls = sapply(ns, function(e) str_trim(e %>% xmlAttrs()))
head(urls)
## [1] "a280.tsp"       "a280.opt.tour"  "ali535.tsp"    
## [4] "att48.tsp"      "att48.opt.tour" "att532.tsp"
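The href values are relative to the page address. As a small sketch, assuming the files sit directly under the page URL (which the relative links suggest), absolute addresses can be built with paste0(). The names base_url, hrefs and full_urls are chosen for this example only, and hrefs repeats the first three values from the output above so the snippet runs on its own.

```r
# Page address used in this tutorial
base_url <- "http://elib.zib.de/pub/mp-testdata/tsp/tsplib/tsp/"

# First three relative links, copied from the output above
hrefs <- c("a280.tsp", "a280.opt.tour", "ali535.tsp")

# Prefix every relative link with the page address
full_urls <- paste0(base_url, hrefs)
print(full_urls)
```

Each entry could then be passed to download.file() or readLines() if the files themselves are needed.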

In a similar way we can obtain the descriptions (explanations, comments, text) from the <li> elements. Here, we do a few string manipulations to clean the data.

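ns = getNodeSet(root, '//body/ul/li') # list items
expl = sapply(ns, function(e) {
    ce <- e %>% xmlChildren()
    txt = str_trim(ce$text %>% xmlValue())  # first text node: "- description"
    txt = str_trim(str_remove(txt, '^-'))   # drop the leading dash only
    txt = str_squish(txt)                   # collapse line breaks and repeated spaces
    # append the italic part, e.g. the instance name of an optimum tour
    if ("i" %in% names(ce)) {txt = paste(txt, ce$i %>% xmlValue())}
    return(txt)
  })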
head(expl)
## [1] "Drilling problem (Ludwig)"                      
## [2] "Optimum tour for a280"                          
## [3] "535 Airports around the globe (Padberg/Rinaldi)"
## [4] "48 capitals of the US (Padberg/Rinaldi)"        
## [5] "Optimum solution for att48"                     
## [6] "532-city problem (Padberg/Rinaldi)"
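Since the links and the descriptions come from the same list items in the same order, they can be combined into one table. The following is a minimal sketch in base R. The names files, descr, tsplib and instances are illustrative, and the three rows are copied from the outputs above so that the snippet runs without the downloaded page.

```r
# Values copied from the outputs above (first three entries)
files <- c("a280.tsp", "a280.opt.tour", "ali535.tsp")
descr <- c("Drilling problem (Ludwig)",
           "Optimum tour for a280",
           "535 Airports around the globe (Padberg/Rinaldi)")

# One row per list item
tsplib <- data.frame(file = files, description = descr,
                     stringsAsFactors = FALSE)

# Flag optimum tours by their file extension
tsplib$is_tour <- grepl("\\.opt\\.tour$", tsplib$file)

# Keep only the problem instances
instances <- tsplib[!tsplib$is_tour, ]
print(instances)
```

With the full vectors, the same call would be data.frame(file = urls, description = expl), provided both have the same length.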

Summary

We showed how to extract data from a website using htmlTreeParse(). We drilled into the document's structure step by step using xmlChildren(). We then showed a shortcut, getNodeSet(), which takes us directly to the required information. Finally, we used a few string manipulations such as str_trim() and str_remove() to clean the text.

Resources

My personal favourite is Gaston Sanchez's (2014) systematic, comprehensive and focused tutorial.

Once the “rough” web scraping is done, you will find it worthwhile to learn regular expressions to speed up the data extraction process. This Regular expressions introduction (local copy) is one of the best I have discovered so far.
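As a taste of what regular expressions can do with the scraped descriptions, here is a small sketch. It uses base R so that it runs without extra packages; stringr's str_extract() would be a close equivalent. The variable names descr, author and size are illustrative, and the patterns assume that the author appears in parentheses at the end and that the problem size, if given, is a leading integer.

```r
# Descriptions copied from the output earlier in this tutorial
descr <- c("Drilling problem (Ludwig)",
           "535 Airports around the globe (Padberg/Rinaldi)",
           "532-city problem (Padberg/Rinaldi)")

# Text inside the trailing parentheses
author <- sub("^.*\\(([^)]*)\\)[[:space:]]*$", "\\1", descr)

# Leading integer, NA where the description does not start with a number
m <- regexpr("^[0-9]+", descr)
size <- rep(NA_integer_, length(descr))
size[m > 0] <- as.integer(regmatches(descr, m))

print(data.frame(author, size))
```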

This tutorial is typical of those you will find to get you started with web scraping in R.

I have used RSelenium, which is pretty good for remote-controlling websites and getting data from “protected” sites in an interactive, automated way.

By the way, in Power BI you can use GetData >> Web, which is pretty good if the page contains tables.

Once you have extracted all the data, you may need to start with natural language processing (NLP). This link is a good collection of tools.


Cite as

Harvard Style:
Garn, W. (2021). Web scraping in R. Available from: https://www.smartana.org/blogs/WebScrapping.html [Accessed 18 March 2021]

APA Style:
Garn, W. (2021, March 18). Web scraping in R [Blog post]. Retrieved from: https://www.smartana.org/blogs/WebScrapping.html