jandrewrogers 8 years ago

I've designed systems that do this on continental scales (i.e. hundreds of millions of cell phones simultaneously, in real-time). The devil is in the details, and they are non-trivial; this is not an "intern and 6 months" job. Mobile telemetry is not nearly as ideal in practice as assumed here, and it typically takes a couple of years to learn how to handle the numerous peculiar artifacts of that data that will damage the quality of a naive implementation. Reconstructing a model of the population from the cleaned data that approximates the ground truth is surprisingly difficult and requires quite a bit of clever data science and maths.

It takes a lot of work and expertise to build a population model from mobile telemetry that approximately reflects reality. Far fewer people know how to do this well than you might assume by looking at the requirements for a naive implementation. Even most mobile carriers have limited ability.

  • frandroid 8 years ago

    Have you posted about this at length somewhere? If not, care to elaborate on what it took to design this system?

    • jandrewrogers 8 years ago

      I have not written about it. Most of the difficulty and complexity, from my perspective, is in the data science and processing required to construct an accurate population model, which requires additional data sources beyond the mobile telemetry. I designed the custom database platforms (easy for me) underneath which supported the online data processing.

      It isn't that difficult technically, if you have experts doing it, it just requires far more domain expertise to do correctly than I think people expect. You also need to be willing to write some of your own tooling to deal with the data efficiently and effectively.

      • dajohnson89 8 years ago

        I'm gonna take a wild guess, and say that the NSA has a monopoly on the talent for this field.

        • rl3 8 years ago

          The Snowden leaks confirmed NSA has the ability to conduct co-traveler inference.[0] In other words: finding mobile devices proximate to a targeted mobile device, based on similar vectors. Perhaps even making associations in absence of targeting via patterns in device proximity over time.

          It probably gets real interesting when they're trying to distinguish between various modes of transit, such as a city bus, an Uber/Lyft/taxi, and a private vehicle not participating in rideshares. Of those examples, the latter would suggest the highest degree of association.

          Pure speculation, but I wouldn't doubt they take a peek at ridesharing data for co-traveler inference purposes. Knowing if a rideshare driver is on or off the clock would be incredibly valuable information in that context.

          [0] https://www.washingtonpost.com/apps/g/page/world/how-the-nsa...

          • jandrewrogers 8 years ago

            Graph reconstruction from space-time event data goes far beyond the above in terms of capability. You can infer relationships between people that never co-travel, infer that people have been places that are not in the event data, etc by stitching together large numbers of orthogonal event streams over long periods of time. It is straightforward to distinguish between various modes of transit analytically. The "metadata" that simply indicates an event in space-time is far more valuable analytically than the data because it is possible to reconstruct so much with it that isn't contained within the data per se.

            I was doing all of this five years ago, the capability has been around for a while.

            • pjc50 8 years ago

              Do you ever worry that this data might be used for bad purposes?

        • user5994461 8 years ago

          Mapping is a wide and common industry, just like web or finance.

          The NSA only recruits in the USA. It's a fraction of the talent pool of the planet.

          • compuguy 8 years ago

            It's actually smaller. At minimum you have to be a naturalized or native-born citizen of the USA for many jobs in the US Government. For the NSA, add the requirement of having a current clearance or the time to get one (worst case, years).

          • cmahler7 8 years ago

            The NSA violates the constitution on a daily basis; I'm sure they have a loophole to get whatever talent they need.

        • frankydp 8 years ago

          Most of this talent is in the private sector. Primarily around traffic.

        • jandrewrogers 8 years ago

          The handful of people I know that are real experts at the data science are all in the private sector.

      • user5994461 8 years ago

        A great PoC is fairly doable by an intern.

        Increasing the precision tenfold will likely increase the effort a hundredfold or more. Just because it can be made harder and more expensive doesn't mean it has to be.

        At the end of the day, a bit of precision doesn't change the nature of an effective planet scale mass surveillance system.

      • awkwarddaturtle 8 years ago

        > which requires additional data sources beyond the mobile telemetry.

        So it really isn't technically difficult. It's just lacking in data?

        In your original post, you made it seem like it was extremely difficult. From my perspective, it seems like child's play from a technical standpoint.

        > It isn't that difficult technically, if you have experts doing it

        But your top comment implied it was extremely technically difficult.

        • jandrewrogers 8 years ago

          There are some things that are technically very difficult if you have no domain expertise, which applies to the subject matter. Most people that try to do this without experience fail in practice. It takes a lot of time and effort to become competent at it, but once you figure it out it is repeatable without too much effort.

          There is a much smaller set of things that are technically difficult to execute even if you are highly experienced at doing it -- each time is a challenge. This is not one of those cases, it just has a severe learning curve.

  • zkms 8 years ago

    > the numerous peculiar artifacts of that data that will damage the quality of a naive implementation.

    Do you have examples (or a link/reference to something that has such examples) of those types of artifacts?

  • pdoege 8 years ago

    Carrier iQ demonstrated GPS level tracking of 100m+ phones nearly 10 years ago.

strictnein 8 years ago

This article finally answered a question I've had for a while: how they can do decent triangulation with just two towers.

> "We said that a tower covers a radius around it. In practice, this is sub optimal so that’s not how it’s done.

> Instead, a station is usually split in 3 independent beams of 120 degrees."

So it's not the intersection of two circles anymore, it's the intersection of two arcs, which will likely only have one intersection point, unlike circles.
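A rough sketch of that geometry, in case it helps: treat each tower as giving a range ring plus a 120 degree beam, intersect the two rings, and keep only the points that fall inside both beams. Everything here is an assumption for illustration and not taken from the article: the function names, flat x/y coordinates in arbitrary units, azimuths measured counterclockwise from the +x axis (not compass bearings), and the idea that each tower supplies a clean distance estimate.

```python
import math


def circle_intersections(c0, r0, c1, r1):
    """Return the 0, 1 or 2 points where two range rings cross."""
    (x0, y0), (x1, y1) = c0, c1
    d = math.hypot(x1 - x0, y1 - y0)
    # No crossing: same centre, rings too far apart, or one ring inside the other.
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    a = (r0 ** 2 - r1 ** 2 + d ** 2) / (2 * d)  # distance from c0 to the chord
    h = math.sqrt(max(r0 ** 2 - a ** 2, 0.0))   # half the chord length
    mx = x0 + a * (x1 - x0) / d                 # chord midpoint
    my = y0 + a * (y1 - y0) / d
    ox = h * (y1 - y0) / d                      # offset along the chord
    oy = h * (x1 - x0) / d
    points = [(mx + ox, my - oy), (mx - ox, my + oy)]
    return points[:1] if h == 0 else points


def in_sector(tower, point, azimuth_deg, width_deg=120.0):
    """True if `point` lies inside the beam centred on `azimuth_deg`."""
    bearing = math.degrees(math.atan2(point[1] - tower[1], point[0] - tower[0]))
    # Wrap the angular difference into [-180, 180) before comparing.
    diff = (bearing - azimuth_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= width_deg / 2


def locate(tower_a, range_a, az_a, tower_b, range_b, az_b):
    """Ring crossings that fall inside both towers' beams."""
    candidates = circle_intersections(tower_a, range_a, tower_b, range_b)
    return [p for p in candidates
            if in_sector(tower_a, p, az_a) and in_sector(tower_b, p, az_b)]
```

With towers at (0, 0) and (10, 0), both reporting a range of sqrt(50), the rings cross at (5, 5) and (5, -5); beams centred on 60 and 120 degrees keep only the point near (5, 5). Real range estimates are noisy, so in practice the rings would be bands and not thin lines.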

  • mostlyskeptical 8 years ago

    A step beyond that: some can get fairly accurate with a single "tower". We had at minimum 3 BTS on every site, generally at 120 degree spacing, though this could vary. Each BTS had multiple antennas and would do some rudimentary triangulation based on signal arrival times at each antenna. We could generally get within a couple of hundred feet.

Rjevski 8 years ago

Note that a lot of the information from the BTS is already available to anyone who "asks nicely".

The mechanism that provides roaming is based on trust, so anyone connected to the SS7 network can query the location of any phone in the world and even intercept its calls. Just say to the home carrier "hey this phone is roaming on my network, would you be able to send me all of its calls and texts?".

  • nawtacawp 8 years ago

    There was a talk/demo I saw a few years ago that went into great detail about how this works. I remember it was given by a German. Anyone know what I am talking about?

    Edit: It was a video.

  • mi100hael 8 years ago

    Which is also why SMS makes a poor authentication factor.

  • rsync 8 years ago

    "The mechanism that provides roaming is based on trust, so anyone connected to the SS7 network can query the location of any phone in the world and even intercept its calls."

    One of the first things I did after opening the article was to search for the string "ss7" ... was disappointed to see it mentioned zero times ...

warrenm 8 years ago

The phone companies already do this, more or less, as is shown in court cases where cell phone records are brought in as evidence

A decade ago that data was a little more iffy (i.e. it was more a good estimate (typically within half a mile or less) than a true location), but with a combination of more towers (and therefore more data points), the ubiquity of smartphones (which check in more often, are doing geolocation-related things, etc.), and better, more accessible and well-known analytics tools, I think even 6 months would be a generous time-frame.

  • jimktrains2 8 years ago

    > The phone companies already do this, more or less, as is shown in court cases where cell phone records are brought in as evidence

    You can also arrange to buy this information. I worked for a place where you could request someone's location by phone number. There were a lot of contractual obligations around us having the phone owner "allow" us to do that, but no technical ones.

    • EdwinHoksberg 8 years ago

      How did you access the data? Via a simple REST api?

      • jimktrains2 8 years ago

        > How did you access the data?

        We signed a contract, fulfilled our obligations, and paid them money.

        > Via a simple REST api?

        I really don't remember. It might have been SOAP or something. It was an HTTP-based API, but I don't think it was REST specifically.

        There was also a 30s or so delay from request to when we'd get the location back.

    • jliptzin 8 years ago

      What's the accuracy on that location information? Is it down to the meter/10 meter/100 meter?

      • jimktrains2 8 years ago

        It wasn't GPS accurate, but accurate enough for our usage (monitoring volunteers). I want to say it was under 10m and over 5m on average. It's been about 6 years since I've dealt with the system, so the details are a bit fuzzy.

    • Jonnax 8 years ago

      Was it a phone operator specific system? Like could you get the location of any phone number or only AT&T's, for example.

      • jimktrains2 8 years ago

        I believe our provider had contracts with the big 4 in the US.

    • monksy 8 years ago

      Yep, this is more than possible. It's done in marketing. If you have a short code to "text for coupons", there's a high chance that they're doing a ping against your number.

    • rasz 8 years ago

      repo men usually have someone able to do this for $50-200

contingencies 8 years ago

Q. What Does It Really Take to Track a Million Cell Phones?

A. Sell outsourced billing solutions to the mobile carrier. (See AMDOCS)

frankydp 8 years ago

Inrix, TOMTOM, and a couple of others have been providing this data as a product for at least 2 decades. There was an early provider that led the space, but the name of that company eludes me at the moment; it may actually have been purchased by Inrix.

Most of those companies focused on ±10m resolution and on path data to build traffic speed data for local news companies.

Only cost a couple million bucks and an extensive partnership agreement to get into the space.

There is a lot of data washing in those agreements, mostly related to preventing reverse identification.

Airsage has taken it to the next level in the more recent past with GPS based anonymized data, but data with EXTENSIVE history. The Airsage product is zip code and smaller resolution and can provide months to years of location history of an anonymous cell phone id.

mikhailfranco 8 years ago

To answer the 'Call for comment' about intersecting complex shapes... one simple, fast, general, approximate, discrete method is to use OpenGL to get your GPU to do it for you. Just render the shapes into an off-screen framebuffer, using appropriate logic ops or stencil planes, then read back the final buffer to get a bitmask of the possible positions. To reduce to one estimate of position, find the centroid of the largest contiguous pixel group (flood-fill different seed ids; histogram pixels; select region id with highest count).
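For anyone who wants to try the idea without setting up OpenGL, here is a minimal CPU stand-in for the same pipeline: rasterize each shape to a mask, AND the masks together, flood-fill the result, and take the centroid of the largest region. To be clear, this swaps the GPU rendering for plain Python loops, and the names (`rasterize`, `intersect`, `largest_region_centroid`, `disc`), the use of shape predicates, and the 4-connected flood fill are all my own assumptions for illustration.

```python
from collections import deque


def rasterize(shape, width, height):
    """Sample a shape predicate at every pixel centre to get a 2D boolean mask."""
    return [[shape(x + 0.5, y + 0.5) for x in range(width)] for y in range(height)]


def intersect(masks):
    """Pixel-wise AND of several masks (the job a logic op or stencil does on a GPU)."""
    return [[all(px) for px in zip(*rows)] for rows in zip(*masks)]


def largest_region_centroid(mask):
    """Flood-fill 4-connected regions; return the centroid of the biggest, or None."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    best = []
    for sy in range(height):
        for sx in range(width):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first fill starting from this unvisited seed pixel.
            region, queue = [], deque([(sx, sy)])
            seen[sy][sx] = True
            while queue:
                x, y = queue.popleft()
                region.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(region) > len(best):
                best = region
    if not best:
        return None
    n = len(best)
    # Add 0.5 so the result is expressed in pixel-centre coordinates.
    return (sum(x for x, _ in best) / n + 0.5, sum(y for _, y in best) / n + 0.5)


def disc(cx, cy, r):
    """Hypothetical coverage shape: a filled circle."""
    return lambda x, y: (x - cx) ** 2 + (y - cy) ** 2 <= r * r
```

Usage would look like `largest_region_centroid(intersect([rasterize(disc(20, 32, 20), 64, 64), rasterize(disc(44, 32, 20), 64, 64)]))`, which lands in the middle of the lens where the two discs overlap. Accuracy is limited by the pixel size, which is the same trade-off the framebuffer resolution gives you on the GPU.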

harlanji 8 years ago

I did the math a while back, don't have the notes at the moment, but scaling an AWS system I built enough to collect 600m points of data each minute and compute on data within 100ms and retain it for a few minutes would run a bit over $10k usd/mo to operate. I operated it at about 3m events/min with a good amount of compute per including ip to geo lookup... Zookeeper would be the only bottleneck in this case assuming good enough partitioning.

  • tinix 8 years ago

    Using AWS is the problem here, and that's why it's so expensive. You could do this on bare metal WAY faster and more efficiently, and then you own the hardware forever, for the price you paid to do it for a month with a third party.

    AWS does not scale this way, you can't just throw more resources at a problem and expect to be profitable.

    • user5994461 8 years ago

      Put everything in Google BigQuery or Google DataProc.

      Cheaper than both and hardly any maintenance required.

    • harlanji 8 years ago

      Agreed it could be done cheaper over long term, just wanted to share about an actual prod system. This also had 3x replication via Kafka to avoid stampedes etc if anything failed and keep going with an at-least-once guarantee.

      In my opinion tho, even that price point is pretty accessible to keep tabs on all citizens with that resolution which was my hypothetical case.

    • greenleafjacob 8 years ago

      You would own the hardware until it died which is not forever.

      • tinix 8 years ago

        Just because it quits working doesn't mean you don't still own it. Might wanna lay off the green leaf, bro. Hahaha

losteverything 8 years ago

If my phone is powered off can I be tracked?

What if I remove the battery?

  • sillysaurus3 8 years ago

    Yes, if you're in a city you're tracked constantly by dragnet surveillance.

    Questions like this one aren't very useful without a threat model. Who are you trying to prevent tracking you? If it's just your phone carrier then obviously turning off your phone and removing the battery will render it inoperable. But now you don't have a phone, and your location info wasn't very useful to begin with anyway unless you were involved in an operation where you need to conceal your location.

    • losteverything 8 years ago

      This goes into I Don't Know What I Don't Know Dept. (sorry Mad magazine)

      A patron was telling me that the way the GPS is so accurate is because it uses the phone's radio... Didn't know that either. (I mentioned to him that there is one spot in MA where our Google directions are off by 1/2 mile. Same place every trip.)

      • Jonnax 8 years ago

        It's called Assisted GPS: https://en.wikipedia.org/wiki/Assisted_GPS

        It speeds up getting an accurate location, doesn't provide a more accurate location than GPS.

        GPS is generally accurate to around 5 or 6 metres. That's the technology on its own.

  • gvb 8 years ago

    If your phone is powered off or in airplane mode it is not supposed to emit RF and thus cannot be tracked. This is a matter of trust, so if your threat model includes high end threats, the assumption that it follows the normal requirements may be invalid.

    If you remove the battery, it will be unpowered and unable to emit RF and thus cannot be tracked. While it is theoretically possible to hide an auxiliary battery in your phone, that would be very hard to achieve, especially in modern thin phones. If your threat model includes highly motivated state-sponsored actors, this could be achieved.

    If you put your phone in a RF-tight enclosure (e.g. metal box), the RF energy cannot get out and thus it cannot be tracked.

    • aembleton 8 years ago

      Modern thin phones don't let you remove the battery, so they could easily continue to transmit RF.

  • ww520 8 years ago

    You can put the phone inside a RF radio blocking bag.

    • dboreham 8 years ago

      The bag is only going to attenuate any signal, not block it (block would imply infinite attenuation). Whether or not the attenuation provided is sufficient to prevent an adversary from receiving the signal I'm not sure. I definitely wouldn't bet my life on it. I'd want a pretty thick metal box with proper seam gaskets as a minimum.

liprais 8 years ago

This method will only work with GSM networks because (1) GSM networks don't verify the BTS and (2) GSM encryption keys are cracked and all over the internet. Users of other kinds of networks should not worry about this kind of hack. Actually, here in China a fake BTS, a.k.a. 伪基站, can be easily purchased online.

  • rsync 8 years ago

    "this method will only work with GSM network because ..."

    Yes, that's true - but remember that all of our 3G/4G phones are also 2G phones and that if you disable/jam/overpower the 3G/4G signals the phone will very happily revert down to 2G, possibly with no encryption, and possibly in a way that you have to be very careful to even notice.

    There are quite a few attacks that are mitigated by 3G/4G in theory, but in practice you're still vulnerable to because your phone can be downgraded to 2G by an outside actor.

    • girvo 8 years ago

      Interestingly, the 2G networks are being (or have? I can't remember which) shut down entirely here in Australia.

    • user5994461 8 years ago

      It works on all generations: 2G 3G 3.5G 4G LTE.

  • privong 8 years ago

    > Users of other kind of networks should not worry about this kind of hack.

    The article is not about a hack. The article is about how the cell company or state-level actor can leverage the connectivity information that is required for any modern cell service to operate.

TACIXAT 8 years ago

I was hoping there would be some information in here about what cell phones leak that a third party could pick up on. For example, tracking the MAC address in beacon packets, or the cell frequency equivalent of that. Of course if you can hook into the base stations you can track them.

eleitl 8 years ago

Now you know why my Nokia 3310 is switched off most of the time.

draw_down 8 years ago

Nothing worthwhile ever takes an intern and six months. Ever.

  • tripzilch 8 years ago

    One has to wonder why interns even bother. /s

devrandomguy 8 years ago

A: A deeply sociopathic mindset. See the requirements section for details.

  • etiam 8 years ago

    I have no intention of contradicting that, but if so, that makes it all the more disturbing that there are organizations where this is not only standard operating procedure but just one tool among many similar ones.

  • thinkfurther 8 years ago

    This is "off-topic" to the attention span of HN. I recently realized this when someone mentioned corruption as the main problem of some issue; yeah, that's "very general", but nevermind programmers, not even a power user would keep looking for program bugs or worry about the order they do things in when it's already been confirmed that the memory or PSU or something like that is faulty. Anyone worth their salt would stop those other debug activities to focus on correcting that, while someone who absorbed these broken parts and/or their acceptance as part of their synthetic identity will do anything but that.

    Also see Hannah Arendt, Erich Fromm, et al. This other mediocre shit? This being a "hacker" in a goldfish bowl? That's for those who can't hack the adult responsibilities of the 20th and 21st centuries. Those who fell asleep, those who already fell off. They will downvote you today and look the other way as drones take care of you tomorrow, don't hold your breath for anything else. Anything else, any future worth a fuck, has to be done despite their wishes, or rather, despite where they are drifting.

trekking101 8 years ago

Somebody please explain this line from the post:

Radio waves travel at the speed of light 299 792 458 m/s.

  • calebm 8 years ago

    Both radio waves and light are just differing frequencies of Electromagnetic Radiation.

    • trekking101 8 years ago

      GBNST (guilt by non-scientific thinking): Radio waves = sound = speed of sound, therefore wtf sound = light, but now I 'see the light.'

      After having a second cup of coffee I did a doh! and realized conflating 'radio' with sound is non-sensical, but I wonder if I'm in the minority thinking this way. Or maybe it's just my non-tech background!

  • officialjunk 8 years ago

    When you turn on a lightbulb, the light coming from the bulb travels at the speed of light, which is 299,792,458 meters per second. Radio waves also travel at this same speed, since they are also a form of light.

    • sillysaurus3 8 years ago

      It's worth mentioning that neither light nor radio waves travel at 299,792,458 m/s through atmosphere. That's the speed of light in a vacuum.

      An interesting question is whether radio waves, gamma radiation, and visible light all travel at an identical speed through atmosphere.

      The reason light slows down in atmosphere is because it hits atoms. It travels between each atom at the speed of light, but when it reaches an atom the radiation is absorbed and re-emitted, which introduces a delay. So the question that I'm wondering is: do different frequencies of radiation get absorbed and re-emitted at the same rate as every other frequency? That would give it identical speed. But if the absorption is different then presumably the speed would also be different.

      • smeyer 8 years ago

        > An interesting question is whether radio waves, gamma radiation, and visible light all travel an identical speed through atmosphere.

        They don't. The index of refraction tells you about how the speed of light is changed by a medium, and the fact that it's different for different colors of visible light is why you get effects like rainbows.

        This stack exchange question might interest you if you'd like to read more: https://physics.stackexchange.com/questions/196803/why-is-th... .
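
        To put a rough number on it: the speed in a medium is the vacuum speed divided by the index of refraction n. The value n ≈ 1.0003 below is the usual textbook figure for visible light in air near sea level, used here as an assumption; the exact index depends on frequency, pressure, temperature and humidity, which is the dispersion described above.

        ```latex
        v = \frac{c}{n}, \qquad
        n_{\text{air}} \approx 1.0003
        \;\Rightarrow\;
        v \approx \frac{299\,792\,458\ \text{m/s}}{1.0003}
        \approx 2.997 \times 10^{8}\ \text{m/s}
        ```

        That is only about 90 km/s slower than in a vacuum, so for rough distance estimates from signal timing the vacuum value is usually a fine approximation.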