Recently I had the unfortunate displeasure of having to move my vacation dates because of a certain pandemic that, despite the silence of our health authorities, is still ongoing and impacting people.

So instead of taking a crammed bus without ventilation for nine hours, my roommates and I got last-minute tickets on the French high-speed railroad train: the Train à Grande Vitesse (High-Speed Train).

Last time I took that train I realized there was something funny that I could play with in the train: its internet connection.

The WiFi

If you are on a TGV that has WiFi you gain access to what the Société Nationale des Chemins de fer Français (SNCF) calls "WiFi SNCF", which is essentially an on-board WiFi service. The SNCF engineered a system of repeaters that allows their trains to maintain a (somewhat) constant connection to a cellular data-like network. It mostly works. Of course, even they recognize and advertise that if you plan to use it for very bandwidth-intensive operations that need a fairly reliable connection, you should just wait. That serves them as well, to be honest, because they have to manage an internal network of many devices all going through a fairly tiny gateway. On my first train that day, we reached more than 240 devices connected in the train.

How do I know that? I did not learn that from a ping scan. The WiFi API told me.

Funny Endpoints

I already knew that there were web API endpoints you could call on the local user-friendly dashboard to the WiFi, accessible on every train at wifi.sncf. A friend and I planned on trying to inspect as many of them as possible during that trip (to kill time however we could).

Booting up developer tools on our web browsers, we noticed more than a handful of little queries that returned nice-looking JSON-encoded objects, and we dug deeper. What we found was:

  • Trip details and updates
  • WiFi metrics
  • Information used on the dashboard trip map
  • GPS position and heading
  • Various media

So I'm going to go through them from least interesting to most interesting.

The full list of endpoints is:

  • /router/api/bar/attendance
  • /router/api/chat/room
  • /router/api/configuration/modules
  • /router/api/connection/statistics
  • /router/api/media/videos
  • /router/api/poi
  • /router/api/train/coverage
  • /router/api/train/details, or /router/api/train/progress (they are, as far as I can remember, the same)
  • /router/api/train/gps
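If you would rather poke at these from a script than from the browser's developer tools, here is a minimal Python sketch. The base URL is the one the portal uses on the train; the helper names endpoint_url and fetch are my own, and fetch can only succeed while connected to the on-board WiFi.

```python
import json
from urllib.request import urlopen

# Base URL of the on-board portal API (only reachable from the train's WiFi).
BASE_URL = "https://wifi.sncf/router/api"


def endpoint_url(path: str) -> str:
    """Build the full URL for an endpoint path such as 'train/gps'."""
    return f"{BASE_URL}/{path.strip('/')}"


def fetch(path: str, timeout: float = 5.0):
    """Fetch an endpoint and decode its JSON body (hypothetical helper)."""
    with urlopen(endpoint_url(path), timeout=timeout) as response:
        return json.load(response)
```

On the train, fetch("train/gps") should then give you the decoded payload as a Python dict.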

"Meh" Endpoints

/router/api/chat/room was an endpoint that was unexpectedly disappointing. If memory serves me right, it only provided very simple descriptions of buttons on the on-board assistance chat service, which were not interesting to me. The same goes for /router/api/configuration/modules: I did not even save it, and I cannot remember what its purpose was. Whenever I take a TGV again, I will have to dig again. I am confident, however, that if past me did not save them, then they had a good reason not to do so, as they were probably fairly useless when it comes to extracting data from the portal.

The Bar

You can fetch an endpoint to know whether or not there is currently a queue at the on-train bar! It's given to you by /router/api/bar/attendance, and returns an object with a single boolean called isBarQueueEmpty which tells you whether the bar is busy at the moment or not.

It's effectively useless. I am pretty sure it needs a manual update from whoever is running the bar behind the counter, and they probably have other things to worry about.

The Media

This might be the only category where I did not save any example of the JSON output. Essentially, the web portal contains little promotional videos the SNCF made to showcase advice to passengers and such. It's mostly promotional and boring. There are little gems here and there, but I did not bother viewing it all.

I did, however, bother downloading all 4.3GB of it:

Screenshot of a list of folders, whose total content amounts to 4.3GB

Once I'm done inspecting it beyond making sure all files are valid video files and thumbnails, I will think about archiving them on archive.org.

The precise endpoint that delivers all paths and languages available for videos is https://wifi.sncf/router/api/media/videos. Videos are grouped in categories (inoui-coulisses, inoui-idees, inoui-pub, tgv), and available in multiple languages (I found DE, IT, ES, EN and of course FR). Or at least, they should be: from what I saw while inspecting the videos, all of them have audio tracks in French.

WiFi Metrics

In the settings area of the WiFi portal, you have a section that gives you an indication of your signal strength and the count of devices currently connected from anywhere else on the train. Spoiler: it's also pulled from a readily accessible endpoint called /router/api/connection/statistics.

The data it returns is simple:

{
  "quality": 5,
  "devices": 246
}

.quality seems to be an integer on an unknown scale that tells you the quality of the internet access at the moment, and .devices, fairly explicitly, tells you about the total number of devices connected. Because it seems that every single one of them gets a unique /21 subnet on the train, and that inter-subnetwork filtering was in place, I did not manage to find any device on my subnetwork other than myself and the gateway.
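For completeness, here is a small Python sketch that decodes that payload into something typed. The ConnectionStatistics class and parse_statistics function are names I made up for illustration; since the scale of quality is unknown, the value is kept exactly as reported.

```python
import json
from dataclasses import dataclass


@dataclass
class ConnectionStatistics:
    quality: int  # unknown scale, kept as reported by the API
    devices: int  # number of devices connected on the train


def parse_statistics(raw: str) -> ConnectionStatistics:
    """Decode the JSON body of /router/api/connection/statistics."""
    data = json.loads(raw)
    return ConnectionStatistics(quality=int(data["quality"]), devices=int(data["devices"]))


# The payload shown above.
sample = '{"quality": 5, "devices": 246}'
stats = parse_statistics(sample)
print(f"{stats.devices} devices connected, quality {stats.quality}")
```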

I have to commend the SNCF, however, on being able to handle that many machines being connected at the same time. Hopefully the end terminals do not use as much bandwidth as, let's say, a League of Legends player at a tournament event (sigh).

Maps, POIs, GPS

On the WiFi portal, there is a nice little map of your trip. For example, one of my trains had this trip:

Screenshot of the map showing a purple trail from Paris to Rennes with a lot of little drop pins with pictures around the path, and the icon of a train somewhere on it

The map contains several interesting things: POIs of notable sightseeing points around the map, the precise routing scheduled to be taken by the train, all the stations you will go through, and your progress. I will keep the progress for later, because I had quite a lot of fun with it (that was sort of my primary goal ever since I heard about the WiFi API).

First, the API gives you exact positioning when available. The endpoint /router/api/train/gps provides a nice little JSON that looks like this:

{
  "success": true,
  "fix": 8,
  "timestamp": 1692349499,
  "latitude": 45.226723333,
  "longitude": -0.150826667,
  "altitude": 76.72,
  "speed": 83.906,
  "heading": 26.2
}

You've got everything you may want. Exact GPS coordinates (longitude, latitude, altitude), heading, speed, a timestamp, and a GPS fix, which seems to essentially be an identifier for which method was used to measure the position. That detail can help you get a sense of the precision provided at a given time by the measurement method.
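Here is a hedged Python sketch of how that payload could be made more readable. It rests on two assumptions that the API does not document: that timestamp is a Unix epoch in seconds (which lands on the day of my trip), and that speed is in metres per second (which would give a plausible TGV cruising speed). The describe_gps name is mine.

```python
from datetime import datetime, timezone

MS_TO_KMH = 3.6  # assumption: the API reports speed in metres per second


def describe_gps(payload: dict) -> dict:
    """Turn a /router/api/train/gps payload into friendlier values."""
    return {
        # assumption: timestamp is a Unix epoch expressed in seconds
        "time_utc": datetime.fromtimestamp(payload["timestamp"], tz=timezone.utc),
        "position": (payload["latitude"], payload["longitude"]),
        "speed_kmh": payload["speed"] * MS_TO_KMH,
        "heading_deg": payload["heading"],
    }


# The payload shown above.
sample = {
    "success": True,
    "fix": 8,
    "timestamp": 1692349499,
    "latitude": 45.226723333,
    "longitude": -0.150826667,
    "altitude": 76.72,
    "speed": 83.906,
    "heading": 26.2,
}
info = describe_gps(sample)
print(info["time_utc"].isoformat(), f"{info['speed_kmh']:.0f} km/h")
```

Under those assumptions the sample corresponds to roughly 302 km/h a little after 09:00 UTC on 18 August 2023, which fits a TGV that had left Bordeaux for Paris.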

At the same time, the map displays a lot of POI icons with landmarks around the area you travel through, and more. All of these POIs are available in /router/api/poi, which returns you a really huge object with entries in this format:

  {
    "id": 20936,
    "title": "Abbaye des Prémontrés en Lorraine",
    "button": "Voir plus",
    "latitude": 48.907502,
    "longitude": 6.057966,
    "image": "/poi/images/20936/small.jpg",
    "isEvent": false,
    "visibleWhenZoomOut": false,
    "article": {
      "header": {
        "header_title": "Abbaye des Prémontrés en Lorraine",
        "image": "/poi/images/20936/big.jpg",
        "legend_image": "Abbaye de Prémontré en Lorraine © Brigitte Merle / Photononstop",
        "department": "France > Grand Est > Meurthe-et-Moselle > Pont-à-Mousson"
      },
      "content": {
        "title_POI": "<p><span><span><span><span><span>Entre <strong>Metz </strong>et <strong>Nancy, en</strong></span></span></span></span></span> Lorraine, la ville de Pont-à-Mousson abrite une abbaye du début du XVIIIe siècle, classée monument historique. L’abbaye des Prémontrés accueille un centre culturel, se visite et propose même des chambres pour y pas [...]",
        "title": "Abbaye des Prémontrés en Lorraine",
        "text": "<br/><br/><h2>Explorez l'abbaye des Prémontés :</h2><h2>L'histoire de l'abbaye</h2><p><span><span><span><span><span>Bâtie au début du XVIIIe siècle, l’abbaye a connu des périodes de faste mais aussi d’abandon. Incendie, révolution, réquisition, encore incendie…elle a été détruite, pillée, vidée, réquisitionnée pour être transformée en hôpital >
      },
      "footer": {
        "author": "Coralie Salvi <br/> Rédaction SNCF Connect",
        "update_article": "2022-10-14"
      }
    }
  },

For those who do not read French, you have: identifier, title, the name of a button to "see more", a small and large thumbnail, and a bunch of HTML to describe the POI (in .content). You also get to see credits for the photographs, and the person who entered this entry into the database of SNCF (thanks Coralie). These POIs can also seemingly be events, which I have never seen, but I can imagine the purpose of, if a commercial event is happening and you want to show it on your map.
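Since the descriptions come wrapped in a lot of nested span tags, here is a minimal Python sketch for turning them into plain text with the standard library's html.parser. The class and function names are mine, and the list of tags treated as separators is a guess based on the tags seen in the sample entry.

```python
from html.parser import HTMLParser

# Tags after which a space is inserted so adjacent blocks do not get glued together.
BLOCK_TAGS = {"p", "br", "h2", "div", "li"}


class TextExtractor(HTMLParser):
    """Collect text nodes, adding spaces around block-level tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.chunks.append(" ")

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(fragment: str) -> str:
    """Strip tags from an HTML fragment and collapse runs of whitespace."""
    parser = TextExtractor()
    parser.feed(fragment)
    parser.close()
    return " ".join("".join(parser.chunks).split())


print(html_to_text("<p><span>Entre <strong>Metz </strong>et <strong>Nancy, en</strong></span> Lorraine</p>"))
```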

As for the thumbnails? Of course I downloaded the whole 35MB or so of them.

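Finally, there is /router/api/train/coverage, which is kind of bizarre. It's a GeoJSON FeatureCollection of LineStrings, each one a series of GPS coordinates (in GeoJSON's longitude, latitude order) tagged with a quality property: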

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "LineString",
        "coordinates": [
          [
            1.454419,
            43.611391
          ],
          [
            1.454473,
            43.611895
          ],
          [
            1.454464,
            43.612209
          ],
          [
            1.454158,
            43.612884
          ],
          ...
        ]
      },
      "properties": {
        "quality": "small"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "LineString",
        "coordinates": [
          [
            1.4384,
            43.635599
          ],
          [
            1.43475,
            43.639033
          ],
          [
            1.434588,
            43.639303
          ],
          ...
        ]
      },
      "properties": {
        "quality": "small"
      }
    }
  ]
}

I will be honest, I don't really know what to do about it. My guess is that it draws the path the train takes on the map? I assume so, at least.
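One thing you can do with it regardless: the payload is shaped like GeoJSON, with coordinates in longitude, latitude order, so measuring each line is straightforward. Below is a Python sketch that sums the length of the LineStrings per quality value using the haversine formula. The function names are mine, and the sample only contains the first four points of the first feature shown above.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (longitude, latitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def length_by_quality(collection: dict) -> dict:
    """Sum the length of every LineString, grouped by its 'quality' property."""
    totals = defaultdict(float)
    for feature in collection["features"]:
        points = feature["geometry"]["coordinates"]  # GeoJSON order: [longitude, latitude]
        length = sum(haversine_m(*a, *b) for a, b in zip(points, points[1:]))
        totals[feature["properties"]["quality"]] += length
    return dict(totals)


# Truncated sample: only the first four points of the first feature.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [
                    [1.454419, 43.611391],
                    [1.454473, 43.611895],
                    [1.454464, 43.612209],
                    [1.454158, 43.612884],
                ],
            },
            "properties": {"quality": "small"},
        }
    ],
}
lengths = length_by_quality(sample)
print({quality: round(metres) for quality, metres in lengths.items()})
```

If the quality property turns out to describe network coverage along the track (a guess on my part, given the endpoint's name), this would tell you how many metres of the route fall under each coverage level.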

Trip Details

This is the most interesting and exploitable part of the data the API hands to you. My end goal was to produce a tool that could help me track the progress of my train across its trip while remaining in the terminal. Of course, I did it after a ton of jq manipulation. The original payload looks like this:

{
  "number": "8504",
  "carrier": "INOUI",
  "events": [],
  "onboardServices": [
    "OCEHP",
    "OCEVP",
    "OCECM",
    "OCEUB",
    "OCEBA",
    "OCEWF",
    "OCEPP"
  ],
  "additionalServices": {},
  "stops": [
    {
      "code": "FRXYT",
      "label": "Toulouse Matabiau",
      "services": {
        "DRIVER": true
      },
      "coordinates": {
        "latitude": 43.611206,
        "longitude": 1.453616
      },
      "progress": {
        "progressPercentage": 100,
        "traveledDistance": 45628.872278306364,
        "remainingDistance": 0
      },
      "theoricDate": "2023-08-18T06:28:00.000Z",
      "realDate": "2023-08-18T06:28:00.000Z",
      "isRemoved": false,
      "isCreated": false,
      "isDiversion": false,
      "delay": 0,
      "isDelayed": false,
      "duration": 0
    },
    {
      "code": "FRXMW",
      "label": "Montauban Ville Bourbon",
      "services": {
        "DRIVER": true
      },
      "coordinates": {
        "latitude": 44.01444,
        "longitude": 1.341499
      },
      "progress": {
        "progressPercentage": 34.13992504047737,
        "traveledDistance": 20981.52831506503,
        "remainingDistance": 40475.92447719702
      },
      "theoricDate": "2023-08-18T06:54:00.000Z",
      "realDate": "2023-08-18T06:54:00.000Z",
      "isRemoved": false,
      "isCreated": false,
      "isDiversion": false,
      "delay": 0,
      "isDelayed": false,
      "duration": 3
    },
    {
      "code": "FRAGF",
      "label": "Agen",
      "platform": "2",
      "services": {
        "DRIVER": true
      },
      "coordinates": {
        "latitude": 44.207967,
        "longitude": 0.620867
      },
      "progress": {
        "progressPercentage": 0,
        "traveledDistance": 0,
        "remainingDistance": 115660.51721035445
      },
      "theoricDate": "2023-08-18T07:32:00.000Z",
      "realDate": "2023-08-18T07:32:00.000Z",
      "isRemoved": false,
      "isCreated": false,
      "isDiversion": false,
      "delay": 0,
      "isDelayed": false,
      "duration": 3
    },
    {
      "code": "FRBOJ",
      "label": "Bordeaux Saint-Jean",
      "services": {
        "DRIVER": true
      },
      "coordinates": {
        "latitude": 44.825873,
        "longitude": -0.556697
      },
      "progress": {
        "progressPercentage": 0,
        "traveledDistance": 0,
        "remainingDistance": 496038.162231625
      },
      "theoricDate": "2023-08-18T08:40:00.000Z",
      "realDate": "2023-08-18T08:40:00.000Z",
      "isRemoved": false,
      "isCreated": false,
      "isDiversion": false,
      "delay": 0,
      "isDelayed": false,
      "duration": 6
    },
    {
      "code": "FRPMO",
      "label": "Paris - Montparnasse - Hall 1 & 2",
      "services": {
        "DRIVER": true
      },
      "coordinates": {
        "latitude": 48.841172,
        "longitude": 2.320514
      },
      "theoricDate": "2023-08-18T10:52:30.000Z",
      "realDate": "2023-08-18T10:52:30.000Z",
      "isRemoved": false,
      "isCreated": false,
      "isDiversion": false,
      "delay": 0,
      "isDelayed": false,
      "duration": 0
    }
  ],
  "trainId": "882"
}

It's quite intricate, but essentially contains:

  1. Information about services on-board (codes that I haven't cracked) in .onboardServices
  2. Information about the various stops in .stops
  3. Lengths and estimated distance across stops (.stops[].progress)
  4. An identifier for your train (.trainId), mine was 882 that day!

Now, each stop is a little intricate as a data structure. You get the hilarious confirmation in .services.DRIVER that you actually have someone to drive the train at each station, which is cool. There are details about the station, like a unique code, its human-readable name, coordinates, and such. You have date information, the time you were theoretically supposed to arrive at/leave a station, and the actual time you did. Finally, you get some booleans that inform you of changes that might have happened to your original route, and some delay information. For me, when traveling across the segment between Agen and Bordeaux Saint-Jean, the details data suddenly foretold 5 minutes of .delay due to "Large attendance" or something (the station was packed and busy, essentially).
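As an illustration of the date fields, here is a Python sketch that builds a small timetable and derives the lateness from theoricDate and realDate rather than trusting .delay, whose unit I only inferred. The function names are mine, and the second stop in the sample has a made-up realDate to show a five-minute delay; it is not data I captured.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"  # e.g. 2023-08-18T06:54:00.000Z


def parse_date(value: str) -> datetime:
    """Parse the API's ISO 8601 timestamps (a trailing 'Z' is read as UTC)."""
    return datetime.strptime(value, DATE_FORMAT)


def timetable(details: dict) -> list:
    """Return (label, planned time, minutes late) for every stop still on the route."""
    rows = []
    for stop in details["stops"]:
        if stop.get("isRemoved"):
            continue  # skip stops dropped from the route
        planned = parse_date(stop["theoricDate"])
        actual = parse_date(stop["realDate"])
        late_min = (actual - planned).total_seconds() / 60
        rows.append((stop["label"], planned, late_min))
    return rows


# Trimmed sample; the second realDate is hypothetical, to illustrate a delay.
sample = {
    "stops": [
        {"label": "Agen", "theoricDate": "2023-08-18T07:32:00.000Z",
         "realDate": "2023-08-18T07:32:00.000Z", "isRemoved": False},
        {"label": "Bordeaux Saint-Jean", "theoricDate": "2023-08-18T08:40:00.000Z",
         "realDate": "2023-08-18T08:45:00.000Z", "isRemoved": False},
    ]
}
rows = timetable(sample)
for label, planned, late_min in rows:
    print(f"{planned:%H:%M} UTC  {label}  (+{late_min:.0f} min)")
```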

Progress Details

And then we get to the progress details, which do not work exactly as you would expect. Each stop carries the progress information for the segment that starts at that stop and ends at the next one. You can therefore identify the last stop of your trip not only by the fact it is last in your .stops but also because it does not have any .progress data. You have three values for each segment X -> Y:

  • .traveledDistance, which is the length you have already traveled along the segment, in meters
  • .remainingDistance, the length that remains on the segment, in meters
  • .progressPercentage, the result of (.traveledDistance)/(.traveledDistance + .remainingDistance) * 100
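The relation between the three values can be checked with a few lines of Python. This is a sketch using the Montauban segment from the payload above; the segment_percentage name is mine, and returning 0 for a zero-length segment is my own choice.

```python
def segment_percentage(progress: dict) -> float:
    """Recompute progressPercentage from the two distances of one segment."""
    traveled = progress["traveledDistance"]
    remaining = progress["remainingDistance"]
    total = traveled + remaining
    # A zero-length segment has no meaningful ratio; report 0 in that case.
    return traveled / total * 100 if total > 0 else 0.0


# The "progress" object of the Montauban Ville Bourbon stop shown above.
montauban = {
    "progressPercentage": 34.13992504047737,
    "traveledDistance": 20981.52831506503,
    "remainingDistance": 40475.92447719702,
}
print(segment_percentage(montauban), montauban["progressPercentage"])
```

The two printed values should agree to within floating-point noise, which is a decent sign that the formula above is what the API computes.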

This particular format is probably fairly modular when it comes to suddenly modifying trips on the fly in case a train needs to be rerouted. Since your trip is only divided into segments, you can re-compute the total distance traveled and such from individual segments in case of modification.

It was a bit more painful to script for me however, because that meant a lot of logic had to go into summing up the lengths traveled, summing up the lengths remaining, and computing the final ratio myself.

Scripting the Details

Obtaining a percentage of your trip is done by summing up the distance traveled across all segments, summing up the distance remaining, and dividing the former by the sum of both. Once you have a percentage of your position, the rest is simple arithmetic, which leads to this atrocity of a one-liner:

# jq treats 0 as truthy, so the total is compared to 0 explicitly to guard the division; -r prints the string without quotes
curl -sL "https://wifi.sncf/router/api/train/details" \
	| jq -Mr '.stops | map(select(.progress != null)) | [ (map(.progress.traveledDistance) | add), (map(.progress.remainingDistance) | add) ] | if (.[0] + .[1]) > 0 then "\(.[0] / (.[0] + .[1]) * 100)%" else "EEE.EE%" end'

which will show you the progress as a percentage (unrounded, so not quite the 3.2f format that the EEE.EE% placeholder suggests). Also, yes, that is almost purely arithmetic done in jq. I learned a lot about jq during my train trip.
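For comparison, here is the same computation as a Python sketch, this time actually formatting the result to two decimals. It is not what I ran on the train; the function names are mine, and the sample keeps only the progress objects from the payload above.

```python
def trip_progress(details: dict):
    """Overall trip progress in percent, or None when it cannot be computed."""
    # The last stop has no "progress" object, so it is skipped here.
    segments = [stop["progress"] for stop in details["stops"] if stop.get("progress")]
    traveled = sum(p["traveledDistance"] for p in segments)
    remaining = sum(p["remainingDistance"] for p in segments)
    total = traveled + remaining
    if total <= 0:
        return None  # no segments, or nothing but zero-length segments
    return traveled / total * 100


def format_progress(details: dict) -> str:
    """Render the progress in a fixed-width 3.2f style, or a placeholder."""
    pct = trip_progress(details)
    return "EEE.EE%" if pct is None else f"{pct:6.2f}%"


# Only the "progress" part of each stop from the payload above.
sample = {
    "stops": [
        {"progress": {"traveledDistance": 45628.872278306364, "remainingDistance": 0}},
        {"progress": {"traveledDistance": 20981.52831506503, "remainingDistance": 40475.92447719702}},
        {"progress": {"traveledDistance": 0, "remainingDistance": 115660.51721035445}},
        {"progress": {"traveledDistance": 0, "remainingDistance": 496038.162231625}},
        {},  # final stop: no progress data
    ]
}
print(format_progress(sample))
```

From there, a progress bar or a TUI is only a few more lines away.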

In Conclusion

It took a while for me to write this. This should come out some time in mid-October, and I was on the train in mid-August. Life's moving fast, I had enough time to even begin my PhD, so that's that.

As to what I take away from this?

  • Nobody is going to bother you for grabbing data that is freely accessible
  • Make weird tools with it! Who cares! Make cute GUIs and TUIs with train emoji 🚂!
  • Document everything you find, and everything you reverse. If it's not useful to you, it could be to someone else.