【NYC OPENDATA】Fetching MTA (New York City Subway) Real-Time Train Data

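Today I happened to come across an old TED talk about big data, [Bilibili][How to find the worst parking spot in New York City, using big data]*, which happens to be about New York City. It mentions that in 2012 Mayor Bloomberg signed an open data law (OpenData Legislation), which led to the NYC OpenData portal and reportedly opened up nearly 1,000 datasets. The talk dates from February 2015, four years ago now, so what has changed since then?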

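Curiosity took me to the NYC OpenData portal, and the first thing I searched for was MTA data. Why? Because there is a third-party app here called Transit that shows the real-time positions of New York subways and buses. I was amazed when I first used it, and I vaguely suspected that MTA must provide a real-time data API, though I was not sure whether it was private or public.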

The portal lists several APIs related to the New York subway. Among them, MTA Data is MTA's real-time data.

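Clicking it redirects to MTA's official site, which asks you to fill in some basic information. In my test, filling in the form and verifying my email address was enough to get an API key.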

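After that you reach the Feed Documentation page. From that page you can tell that MTA's real-time data follows Google's GTFS Realtime standard, and according to Google's documentation, GTFS Realtime is in turn based on the Protocol Buffers serialization format.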
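To make "based on Protocol Buffers" a little more concrete, here is a minimal sketch of what a protobuf message looks like on the wire, using only Node's built-in Buffer. The sample bytes are hand-built for illustration and were not captured from the MTA feed: they encode field 1 of a message as the string "1.0", which is how the gtfs_realtime_version field of FeedHeader would be serialized. The helper names readVarint and readFirstField are hypothetical, and the sketch only handles the one case it needs.

```javascript
// Hand-built sample bytes (illustrative, not real MTA output):
// 0x0a = field number 1 with wire type 2 (length-delimited),
// 0x03 = payload length, then the ASCII bytes for "1.0".
const sample = Buffer.from([0x0a, 0x03, 0x31, 0x2e, 0x30]);

// Read a base-128 varint starting at `offset`.
// Good enough for small values; this sketch ignores integers above 2^31.
function readVarint(buf, offset) {
    let value = 0;
    let shift = 0;
    let pos = offset;
    while (true) {
        const byte = buf[pos++];
        value |= (byte & 0x7f) << shift;
        if ((byte & 0x80) === 0) break;
        shift += 7;
    }
    return { value: value, next: pos };
}

// Decode the first field of a message, assuming it is length-delimited.
function readFirstField(buf) {
    const tag = readVarint(buf, 0);
    const fieldNumber = tag.value >>> 3; // upper bits: field number
    const wireType = tag.value & 0x07;   // lower 3 bits: wire type
    if (wireType !== 2) throw new Error("sketch only handles wire type 2");
    const len = readVarint(buf, tag.next);
    const payload = buf.slice(len.next, len.next + len.value);
    return { fieldNumber: fieldNumber, wireType: wireType, text: payload.toString("utf8") };
}

const firstField = readFirstField(sample);
console.log(firstField);
```

In practice there is no need to parse the bytes by hand. A protobuf library does this from the .proto definitions, which is why the definition files are needed before the feed can be read.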

I am collecting the data with Node.js, so I need protobuf.js, a Protocol Buffers package for Node.js, to decode it.

Feed API list

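Once we have the API key, we cannot preview the responses directly in a browser, because the endpoints return GTFS Realtime data in binary Protocol Buffers form. So it is time to happily run npm init and create a new project.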

To download the data we can simply use the request package.

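After pulling in the protobufjs package, use its load method to load the proto definition file [nyct-subway.proto] provided on the MTA site. Remember to also download Google's official GTFS Realtime definition [gtfs-realtime.proto] and put it in the same directory as nyct-subway.proto.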

The code is as follows:

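// protobuf.js, installed with: npm install protobufjs
const ProtoBuf = require("protobufjs");

// nyct-subway.proto imports gtfs-realtime.proto, so both must sit in ./protobuf
ProtoBuf.load("./protobuf/nyct-subway.proto", function(err, root) {
    if (err)
        throw err;
});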

Next, use root's lookupType method to get the message type named FeedMessage, which is defined in the GTFS Realtime proto.

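ProtoBuf.load("./protobuf/nyct-subway.proto", function(err, root) {
    if (err)
        throw err;

    // Print the loaded root to inspect the namespaces and types it contains
    console.log(root);
    // FeedMessage lives in the transit_realtime package of gtfs-realtime.proto
    let FeedMessage = root.lookupType("transit_realtime.FeedMessage");
});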

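Then you can use this type's decode method to decode the downloaded content. Note that decode only accepts binary data (a Buffer or Uint8Array), so set encoding to null when calling request. Only then does it return a Buffer; otherwise the body always comes back decoded as a UTF-8 string.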
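Why the UTF-8 string form is unusable can be shown without any network call. The sketch below uses only Node's built-in Buffer: it takes a few bytes that are perfectly legal inside a protobuf payload but are not valid UTF-8, converts them to a string and back, and checks whether the bytes survive. The byte values are made up for illustration.

```javascript
// Made-up bytes for illustration: 0x80 and 0xff are not valid UTF-8 on their own.
const original = Buffer.from([0x08, 0x80, 0xff, 0x01]);

// What happens when a binary body is decoded as a UTF-8 string:
// each invalid byte is replaced with U+FFFD (the replacement character).
const asText = original.toString("utf8");

// Converting the string back does not restore the original bytes.
const roundTripped = Buffer.from(asText, "utf8");

// true only if the bytes came through unchanged
const survived = original.equals(roundTripped);
console.log(original, roundTripped, survived);
```

Once the body has been turned into a string the original bytes cannot be recovered, so the download has to stay a Buffer from the start, which is what passing encoding: null in the request options does.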

FeedMessage.decode(buffer);

This gives you the data as an Object.

{ "header": { "gtfsRealtimeVersion": "1.0", "timestamp": "1568241718", ".nyctFeedHeader": { "nyctSubwayVersion": "1.0", "tripReplacementPeriod": [{ "routeId": "B", "replacementPeriod": { "end": "1568243518" } }, { "routeId": "D", "replacementPeriod": { "end": "1568243518" } }, { "routeId": "F", "replacementPeriod": { "end": "1568243518" } }, { "routeId": "M", "replacementPeriod": { "end": "1568243517" } } ] } }, "entity": [{ "id": "000061M", "tripUpdate": { "trip": { "tripId": "117250_M..S", "startTime": "19:32:30", "startDate": "20190911", "routeId": "M", ".nyctTripDescriptor": { "trainId": "1M 1932+ CTL/MET", "direction": "SOUTH" } }, "stopTimeUpdate": [{ "arrival": { "time": "1568244750" }, "departure": { "time": "1568244750" }, "stopId": "G08S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568244840" }, "departure": { "time": "1568244840" }, "stopId": "G09S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568244960" }, "departure": { "time": "1568244960" }, "stopId": "G10S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568245020" }, "departure": { "time": "1568245020" }, "stopId": "G11S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568245140" }, "departure": { "time": "1568245140" }, "stopId": "G12S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568245230" }, "departure": { "time": "1568245230" }, "stopId": "G13S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568245320" }, "departure": { "time": "1568245320" }, "stopId": "G14S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568245410" }, "departure": { "time": "1568245410" }, "stopId": "G15S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", 
"actualTrack": "D1" } }, { "arrival": { "time": "1568245500" }, "departure": { "time": "1568245500" }, "stopId": "G16S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568245590" }, "departure": { "time": "1568245590" }, "stopId": "G18S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568245680" }, "departure": { "time": "1568245680" }, "stopId": "G19S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568245800" }, "departure": { "time": "1568245800" }, "stopId": "G20S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568245920" }, "departure": { "time": "1568245920" }, "stopId": "G21S", ".nyctStopTimeUpdate": { "scheduledTrack": "D1", "actualTrack": "D1" } }, { "arrival": { "time": "1568246010" }, "departure": { "time": "1568246010" }, "stopId": "F09S", ".nyctStopTimeUpdate": { "scheduledTrack": "D3", "actualTrack": "D3" } }, { "arrival": { "time": "1568246190" }, "departure": { "time": "1568246190" }, "stopId": "F11S", ".nyctStopTimeUpdate": { "scheduledTrack": "D3", "actualTrack": "D3" } }, { "arrival": { "time": "1568246280" }, "departure": { "time": "1568246280" }, "stopId": "F12S", ".nyctStopTimeUpdate": { "scheduledTrack": "D3", "actualTrack": "D3" } }, { "arrival": { "time": "1568246400" }, "departure": { "time": "1568246400" }, "stopId": "D15S", ".nyctStopTimeUpdate": { "scheduledTrack": "B1", "actualTrack": "B1" } }, { "arrival": { "time": "1568246490" }, "departure": { "time": "1568246490" }, "stopId": "D16S", ".nyctStopTimeUpdate": { "scheduledTrack": "B1", "actualTrack": "B1" } }, { "arrival": { "time": "1568246580" }, "departure": { "time": "1568246580" }, "stopId": "D17S", ".nyctStopTimeUpdate": { "scheduledTrack": "B1", "actualTrack": "B1" } }, { "arrival": { "time": "1568246670" }, "departure": { "time": "1568246670" }, "stopId": "D18S", 
".nyctStopTimeUpdate": { "scheduledTrack": "B1", "actualTrack": "B1" } }, { "arrival": { "time": "1568246760" }, "departure": { "time": "1568246760" }, "stopId": "D19S", ".nyctStopTimeUpdate": { "scheduledTrack": "B1", "actualTrack": "B1" } }, { "arrival": { "time": "1568246850" }, "departure": { "time": "1568246850" }, "stopId": "D20S", ".nyctStopTimeUpdate": { "scheduledTrack": "B1", "actualTrack": "B1" } }, { "arrival": { "time": "1568247000" }, "departure": { "time": "1568247000" }, "stopId": "D21S", ".nyctStopTimeUpdate": { "scheduledTrack": "B1", "actualTrack": "B1" } }, { "arrival": { "time": "1568247180" }, "departure": { "time": "1568247180" }, "stopId": "M18S", ".nyctStopTimeUpdate": { "scheduledTrack": "J1", "actualTrack": "J1" } }, { "arrival": { "time": "1568247630" }, "departure": { "time": "1568247630" }, "stopId": "M16S", ".nyctStopTimeUpdate": { "scheduledTrack": "J1", "actualTrack": "J1" } }, { "arrival": { "time": "1568247690" }, "departure": { "time": "1568247690" }, "stopId": "M14S", ".nyctStopTimeUpdate": { "scheduledTrack": "J1", "actualTrack": "J1" } }, { "arrival": { "time": "1568247780" }, "departure": { "time": "1568247780" }, "stopId": "M13S", ".nyctStopTimeUpdate": { "scheduledTrack": "J1", "actualTrack": "J1" } }, { "arrival": { "time": "1568247870" }, "departure": { "time": "1568247870" }, "stopId": "M12S", ".nyctStopTimeUpdate": { "scheduledTrack": "J1", "actualTrack": "J1" } }, { "arrival": { "time": "1568247930" }, "departure": { "time": "1568247930" }, "stopId": "M11S", ".nyctStopTimeUpdate": { "scheduledTrack": "J1", "actualTrack": "J1" } }, { "arrival": { "time": "1568248080" }, "departure": { "time": "1568248080" }, "stopId": "M10S", ".nyctStopTimeUpdate": { "scheduledTrack": "M1", "actualTrack": "M1" } }, { "arrival": { "time": "1568248170" }, "departure": { "time": "1568248170" }, "stopId": "M09S", ".nyctStopTimeUpdate": { "scheduledTrack": "M1", "actualTrack": "M1" } }, { "arrival": { "time": "1568248410" }, "departure": { 
"time": "1568248410" }, "stopId": "M08S", ".nyctStopTimeUpdate": { "scheduledTrack": "M1", "actualTrack": "M1" } }, { "arrival": { "time": "1568248620" }, "departure": { "time": "1568248620" }, "stopId": "M06S", ".nyctStopTimeUpdate": { "scheduledTrack": "M1", "actualTrack": "M1" } }, { "arrival": { "time": "1568248680" }, "departure": { "time": "1568248680" }, "stopId": "M05S", ".nyctStopTimeUpdate": { "scheduledTrack": "M1", "actualTrack": "M1" } }, { "arrival": { "time": "1568248770" }, "departure": { "time": "1568248770" }, "stopId": "M04S", ".nyctStopTimeUpdate": { "scheduledTrack": "M1", "actualTrack": "M1" } }, { "arrival": { "time": "1568248890" }, "departure": { "time": "1568248890" }, "stopId": "M01S", ".nyctStopTimeUpdate": { "scheduledTrack": "M1", "actualTrack": "M1" } } ] } }] }

(The data above has been trimmed because there was too much of it.)
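The time values in the decoded output are Unix epoch seconds, stored as strings. As a rough sketch of what can be done with the result, the snippet below turns one entity's stopTimeUpdate list into readable arrival times. The sample object copies the first two stops of the M train trip from the output above, with the NYCT extension fields left out, and listArrivals is a hypothetical helper name.

```javascript
// First two stops of the M train trip from the decoded output above.
const entity = {
    id: "000061M",
    tripUpdate: {
        trip: { tripId: "117250_M..S", routeId: "M" },
        stopTimeUpdate: [
            { arrival: { time: "1568244750" }, stopId: "G08S" },
            { arrival: { time: "1568244840" }, stopId: "G09S" }
        ]
    }
};

// Hypothetical helper: map each stop to its arrival time as an ISO 8601 UTC string.
function listArrivals(ent) {
    // Entities without a tripUpdate yield an empty list.
    const updates = (ent.tripUpdate && ent.tripUpdate.stopTimeUpdate) || [];
    return updates
        .filter(function (u) { return u.arrival && u.arrival.time; })
        .map(function (u) {
            // Epoch seconds -> milliseconds for the Date constructor.
            const ms = Number(u.arrival.time) * 1000;
            return { stopId: u.stopId, arrivesAt: new Date(ms).toISOString() };
        });
}

const arrivals = listArrivals(entity);
console.log(arrivals);
```

The first arrival comes out as 23:32:30 UTC on 2019-09-11, which is 19:32:30 in New York and matches the trip's startTime in the feed.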
