NYU CUSP Data Visualization - Final Challenge - Junru Lu (lj1230)

In [1]:
from IPython.display import HTML, Image

Task 1: Serving Rate

In [2]:
Image("https://lujunru.github.io/images/task1.png")
Out[2]:
  • The average serving rate across all of Manhattan is 45.15%.
  • From 2AM to 5AM, the average serving rate is high (> 90%). During the morning rush hour (7AM - 9AM) and the evening rush hour (5PM - 7PM), the serving rate drops in a similar way, from over 60% to less than 40%. The rate stays very low (32% on average) at night (10PM - 12AM).
In [3]:
HTML('<iframe width="100%" height="800" frameborder="0" \
      src="https://nyu.carto.com/u/junru/builder/fedb1cb3-e4bc-4e43-bca8-b0ae814197aa/embed" \
      allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>')
Out[3]:
  • As we move the time bar, the serving rate is at first uniform across all intersections during the late night (1AM - 5AM); for the rest of the day, intersections around lower Manhattan and above 86th Street have higher serving rates than other areas.
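The notebook does not show how the serving rate was computed, so the following is only a minimal sketch under an assumed definition: the share of trip requests in a group (an hour, an intersection, or both) that were matched to a vehicle. The record fields `hour`, `intersection` and `matched` are hypothetical names, not taken from the original data.

```python
from collections import defaultdict

def serving_rate(trips, key):
    """Return {group: matched / total} for trips grouped by key(trip).

    Assumption: each trip is a dict with a boolean 'matched' field; the
    original notebook does not show its actual schema or definition.
    """
    matched = defaultdict(int)
    total = defaultdict(int)
    for trip in trips:
        group = key(trip)
        total[group] += 1
        if trip["matched"]:
            matched[group] += 1
    # Only groups with at least one request appear, so no division by zero.
    return {group: matched[group] / total[group] for group in total}

# Hypothetical sample records, for illustration only.
sample_trips = [
    {"hour": 3, "intersection": "A", "matched": True},
    {"hour": 3, "intersection": "B", "matched": True},
    {"hour": 8, "intersection": "A", "matched": True},
    {"hour": 8, "intersection": "A", "matched": False},
    {"hour": 8, "intersection": "B", "matched": False},
    {"hour": 8, "intersection": "B", "matched": False},
]

by_hour = serving_rate(sample_trips, key=lambda t: t["hour"])
by_cell = serving_rate(sample_trips, key=lambda t: (t["intersection"], t["hour"]))
```

Grouping by hour alone corresponds to the line chart, while grouping by (intersection, hour) corresponds to the animated map.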

Task 2: Analysis of Unmatched Trips

In [4]:
HTML('<iframe width="100%" height="800" frameborder="0" \
      src="https://nyu.carto.com/u/junru/builder/59646e9a-f237-4fdd-a8cd-411242fcff41/embed" \
      allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>')
Out[4]:
  • Overall, unmatched trips are concentrated below 125th Street. The farther south we move, the denser the unmatched trips.
  • In the late night (1AM - 5AM), unmatched trips are concentrated in lower Manhattan, while in the rush hours (7AM - 9AM and 5PM - 7PM) they spread all over Manhattan. At night (10PM - 12AM), the spatial distribution of unmatched trips shrinks back to lower and midtown Manhattan. These findings suggest that lower Manhattan is the heart of Manhattan.
  • Viewed over time, there are hardly any unmatched trips in the late night. The rest of the day can be divided into two similar parts: 7AM - 4PM and 5PM - 12AM. Both follow the same pattern: unmatched trips keep growing during the rush hour (7AM - 9AM and 5PM - 7PM) and decrease after it. This pattern is reasonable, since it follows the pattern shown in the plot above.

Task 3: Vehicle Speed

In [5]:
HTML('<iframe width="100%" height="800" frameborder="0" \
      src="https://nyu.carto.com/u/junru/builder/f54b43fe-df1d-4908-8c3f-fa1c9c0e20a8/embed" \
      allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>')
Out[5]:
  • This map was made by extracting all running statuses of each vehicle, sorting them by time, calculating the time gap and distance between consecutive records, and selecting the gaps whose implied speed is over the limit.
  • In general, almost all speeding cases last a very short time and happen at the intersections of main roads, especially on East Houston St., Park Avenue, Broadway, Adam Clayton Powell Jr. Blvd and Malcolm X Blvd. I think the reason is that drivers tend to run red lights at intersections to save time.
  • In terms of intensity, around 55% of the cases are minor speeding. However, there are still around 750 severe cases (speed >= 25 * 1.5, i.e. at least 1.5 times the limit).
  • The most frequent speeders are the drivers of vehicles 266, 113, 230, 172 and 368. They are most likely to speed on East Houston St. and Park Avenue.
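The steps described above (group by vehicle, sort by time, compute the gap and distance between consecutive records, keep the over-limit pairs) can be sketched as follows. This is an illustration under stated assumptions, not the code used for the map: the record layout `(vehicle_id, timestamp_seconds, lat, lon)`, the haversine distance, and miles per hour as the unit of the limit of 25 are all assumptions.

```python
import math
from collections import defaultdict

SPEED_LIMIT = 25.0    # from the text; the unit (mph) is an assumption
SEVERE_FACTOR = 1.5   # from the text: severe means >= 25 * 1.5

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    radius = 3958.8  # mean Earth radius in miles
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def find_speeding(records, limit=SPEED_LIMIT):
    """records: iterable of (vehicle_id, timestamp_seconds, lat, lon).

    Returns a list of (vehicle_id, t_start, t_end, speed, severity) for every
    pair of consecutive records whose implied speed exceeds the limit.
    """
    by_vehicle = defaultdict(list)
    for vehicle_id, ts, lat, lon in records:
        by_vehicle[vehicle_id].append((ts, lat, lon))

    cases = []
    for vehicle_id, points in by_vehicle.items():
        points.sort()  # order each vehicle's records by timestamp
        for (t0, lat0, lon0), (t1, lat1, lon1) in zip(points, points[1:]):
            gap = t1 - t0
            if gap <= 0:
                continue  # skip duplicate timestamps to avoid dividing by zero
            speed = haversine_miles(lat0, lon0, lat1, lon1) / (gap / 3600.0)
            if speed > limit:
                severity = "severe" if speed >= limit * SEVERE_FACTOR else "minor"
                cases.append((vehicle_id, t0, t1, speed, severity))
    return cases

# Hypothetical records, deliberately out of order for vehicle 1.
sample_records = [
    (1, 60, 40.7572, -73.99),
    (1, 0, 40.7500, -73.99),
    (1, 120, 40.7582, -73.99),
    (2, 0, 40.7000, -74.00),
    (2, 60, 40.7120, -74.00),
]
speeding_cases = find_speeding(sample_records)
```

Because the speed is inferred from straight-line distance between two reports, a long gap between records will tend to underestimate the true speed, which is one possible reason most detected cases are short.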

Task 4: Number of Passengers >= 5

In [6]:
HTML('<iframe width="100%" height="800" frameborder="0" \
      src="https://nyu.carto.com/u/junru/builder/0176f1d2-c7b7-4a71-982c-b36427c61f4e/embed" \
      allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>')
Out[6]:
  • Carto takes a very long time to visualize every running status of every vehicle over the 24-hour day in one map, so I aggregated them to the zip level. The map shows that the overall rate of over-capacity running is 89%, which means 89% of trips are in violation. The rate is defined as the number of over-capacity running statuses divided by the total number of running statuses in each zip. For every zip in every time period, I counted the statuses recording at least 5 passengers as over-capacity and the others as within-capacity.
  • Moving the time bar reveals some surprising results. Starting at only 8% at 1AM on Oct 5th, the rate keeps growing all day, from around 85% in the morning rush hour to as high as 100% in the evening (7PM - 12AM). I suspect this may be caused by a lack of data, since I filled in 0 for rows where, in a given time period, a zip had reports of only one kind of running status: either >= 5 or <= 4. If there were enough reports for each zip in each time period, the effect of this filling would be very slight. Besides, I don't understand why the data shows hundreds of people on one vehicle. However, I kept this result, since it could be true if the data is correct.
  • For the spatial analysis, since the data may be biased, I only use the late-night data (1AM - 5AM). The rate distribution shows that most violating trips take place in lower Manhattan.
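The rate definition and the zero fill described above can be illustrated with a small sketch. The tuple layout `(zip_code, hour, passengers)` and the sample values are assumptions for illustration; only the threshold of 5 passengers and the ratio itself come from the text.

```python
from collections import Counter

OVER_CAPACITY_MIN = 5  # from the text: a status with at least 5 passengers

def over_capacity_rate(statuses):
    """statuses: iterable of (zip_code, hour, passengers).

    Returns {(zip_code, hour): over-capacity statuses / all statuses}.
    """
    over = Counter()
    total = Counter()
    for zip_code, hour, passengers in statuses:
        cell = (zip_code, hour)
        total[cell] += 1
        if passengers >= OVER_CAPACITY_MIN:
            over[cell] += 1
    # A Counter returns 0 for a missing key, which plays the role of the
    # zero fill: a cell with no over-capacity reports gets a rate of 0.
    return {cell: over[cell] / total[cell] for cell in total}

# Hypothetical statuses, for illustration only.
sample_statuses = [
    ("10001", 1, 2),
    ("10001", 1, 6),
    ("10001", 1, 5),
    ("10002", 1, 3),   # only within-capacity reports in this cell
    ("10003", 19, 7),  # only over-capacity reports in this cell
]
rates = over_capacity_rate(sample_statuses)
```

In this sample, the cells with a single kind of report end up at exactly 0% or 100%, which shows how sparse data in a zip and time period can push the rate to an extreme.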

Task 5: Number of Passengers <= 4

In [7]:
HTML('<iframe width="100%" height="800" frameborder="0" \
      src="https://nyu.carto.com/u/junru/builder/e11c12af-b3bd-4fb7-abab-e3aa527a1b68/embed" \
      allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>')
Out[7]:
  • The aggregation to the zip level follows the same logic as in Task 4. The only difference is that, since the number of passengers is just 0, 1, 2, 3, 4 or 5, I counted by category rather than dichotomously as in Task 4.
  • The average utilization of ride-hailing vehicles in NYC is 2.10 passengers per trip, or 52.5% (2.10 / 4). Vehicles running in lower Manhattan, on both sides of Central Park and above 125th Street are more likely to carry more passengers. That's because those are either the busiest areas or the main traffic areas of Manhattan.
  • Due to the plentiful supply during the late night (1AM - 5AM), utilization at that time drops to 47.5% (1.9 / 4). It grows to 51% during the morning rush hour (7AM - 9AM) and to 57.3% (2.29 / 4) in the evening rush hour (5PM - 7PM). Note that almost every area is busy during these periods. At night (10PM - 12AM), the city is still busy, with utilization staying at 54.8% (2.19 / 4).
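The percentages above are the mean passenger count divided by 4 seats. A minimal sketch of that arithmetic, assuming a plain list of per-trip passenger counts (the sample values are made up):

```python
SEATS = 4  # from the text: utilization is expressed as a share of 4 seats

def utilization(passenger_counts, seats=SEATS):
    """Return (mean passengers per trip, mean as a share of the seats)."""
    mean = sum(passenger_counts) / len(passenger_counts)
    return mean, mean / seats

# Hypothetical per-trip passenger counts, for illustration only.
mean_passengers, share = utilization([1, 2, 3, 2])
```

For the sample list the mean is 2.0 passengers, i.e. half of the 4 seats.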

Task 8: Vehicle Activities - Take Vehicle 224 as an example

In [8]:
HTML('<iframe width="100%" height="800" frameborder="0" \
      src="https://nyu.carto.com/u/junru/builder/19a93bd1-201b-4667-a585-9c6b4bf7cbc3/embed" \
      allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>')
Out[8]:
  • This animated map shows the activities of vehicle 224. At some times during the typical day its trace disappears, which means the vehicle is idling. When it does have a trace, the faster the vehicle runs, the more trips it takes.