General: Facebook VS TikTok

Facebook has previously been in the media because of scandals revolving around the immense amount of data it holds on its users. The most notable is the Cambridge Analytica scandal, reported in 2018, in which data harvested from around 50 million Facebook profiles was used to target personalised political ads. This added to the pressure for reform of how companies look after your data. More recently, however, TikTok has been making the rounds for all manner of issues, the biggest being a fear of data collection via its Chinese owners. That fear led the president of the United States, Donald Trump, to sign an executive order that would effectively ban the app in the US unless its US operations were sold, with Microsoft among the reported buyers. This is mainly due to the amount of information that TikTok collects from consumers' devices. The app was also recently reverse engineered by a Reddit user, who was able to extract part of the source code and inform users about what is actually collected.

As someone who has never quite trusted the data collection of social media, I was curious about what TikTok actually collects, as most social media users are unaware of how much data is collected about them. So, I'm going to look at what data TikTok collects compared with the other giant in the industry, Facebook. For this I'm going purely on the data that is created when you register an account on the platform and access it from a mobile device and a web browser. I will not be looking at what data is saved when posting or consuming media, as figuring out what interests you is the point of these platforms. I will also only be consulting sources that can be verified with hard evidence, specifically the privacy policies of both Facebook and TikTok, plus any other relevant media that may help people make an informed decision about whether they keep using the platform.

Before looking at the differences, let's think about why companies collect your data. One reason is to build analytics on how people use the platform, so the company can improve it to increase popularity and retain existing users. This is done by looking at where the platform is used most, how often and for how long. However, this isn't the main reason. That may seem counter-intuitive, since more people should mean more money, but user numbers matter little if you have no way of actually making money from them. Sites such as Facebook make money from your data, chiefly by selling advertising that is targeted using it, which you were obviously told about in the highly detailed T&Cs that you read. This is the reason these sites are free: they make money from data, and being free means more users. The data made available to customers varies depending on the customer and is often grouped and aggregated. Aggregation is the process of combining lots of individual records to create anonymous statistics that can help companies make informed decisions, such as the proportion of people using Facebook on an iPhone vs an Android, laptop, etc. This then feeds into big data systems, which see millions and millions of pieces of information sent to a central hub to create meaningful statistics that humans can understand. Now, with an understanding of why companies collect and sell data, let's look at what they collect.
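
To make aggregation a little more concrete, here is a minimal Python sketch of the device example above. The record layout and the `device_share` function are my own invention for illustration and are not taken from either company; the point is only that the output contains proportions and no per-user fields.

```python
from collections import Counter


def device_share(records):
    """Return the proportion of users per device type.

    Only the "device" field is read, so user IDs never reach the output.
    """
    counts = Counter(record["device"] for record in records)
    total = sum(counts.values())
    return {device: count / total for device, count in counts.items()}


# Hypothetical per-user records (made up for this example).
records = [
    {"user_id": 1, "device": "iPhone"},
    {"user_id": 2, "device": "Android"},
    {"user_id": 3, "device": "iPhone"},
    {"user_id": 4, "device": "Laptop"},
]

shares = device_share(records)
print(shares)
```

With only four records this is obviously not anonymous in any meaningful sense; aggregation only protects individuals when each group is large.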


Hardware Collection

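| Data collected | Facebook | TikTok |
| --- | --- | --- |
| Operating System | X | X |
| Model information | X | X |
| Version of OS | X | X |
| Battery Level | X | |
| Mouse Movements | X | |
| Language | X | X |
| Time zone | X | X |
| Screen Resolution | | X |
| Storage Space Available | X | |
| Audio Settings | | X |
| Connected audio devices | | X |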

Network Collection

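| Data collected | Facebook | TikTok |
| --- | --- | --- |
| IP Address | X | X |
| Phone Number | X | X |
| Mobile Carrier | X | X |
| Bluetooth | X | |
| Nearby WiFi networks | X | |
| Connection Speed | X | |
| Info on devices on network | X | |
| IP based location | X | X |
| GPS | ~ | X |
| Cookies | X | X |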

Other

| Data collected | Facebook | TikTok |
| --- | --- | --- |
| Mouse movements (Bot detection) | X | |
| Keystroke patterns or rhythms | | X |

Before looking at the actual data, there were a few interesting findings within the privacy policies. First is Facebook. Facebook states that in certain instances, such as searching for a profile on Facebook and then deleting that search query, a log of the search will persist on its servers for up to 6 months. The policy separately says that a copy of government-issued ID submitted for account verification is deleted 30 days after review. Upon your death, Facebook has measures in place to delete all of your information unless the account has been requested to be memorialized, and this appears to happen immediately without the data being kept. In terms of TikTok, it's a little more complicated. Facebook has a single privacy policy viewable on facebook.com, whereas TikTok has multiple policies to cover specific regions of the world.

One interesting note for the US policy is this line.

"We may aggregate or de-identify the information described above. Aggregated or de-identified data is not subject to this Privacy Policy."

In essence, this means that if they collate the data into datasets used for statistics, they follow none of the rules outlined in the policy and can do what they wish with it. However, this only applies if they aggregate the data, which is to collate it into a larger dataset that identifies trends rather than individuals, or de-identify it, which is where everything that could be used to identify a person, such as name and address, has been removed or replaced. This goes far deeper, with different methods of de-identifying data, but we're not looking that far today.
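
As a rough illustration of what "removed or replaced" can mean, here is a minimal Python sketch. The field names, the `de_identify` function and the keyed-hash pseudonym are all assumptions of mine and are not described in TikTok's policy. Replacing an ID with a keyed hash is strictly pseudonymisation, and on its own it may not be enough to stop someone being re-identified from the remaining fields.

```python
import hashlib
import hmac

# Fields treated as direct identifiers in this sketch (an assumed list).
DIRECT_IDENTIFIERS = {"name", "address", "email", "phone"}


def de_identify(record, secret_key):
    """Drop direct identifiers and replace the user ID with a keyed hash."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "user_id"
    }
    digest = hmac.new(
        secret_key, str(record["user_id"]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    cleaned["pseudonym"] = digest[:16]
    return cleaned


# Hypothetical record, made up for this example.
record = {
    "user_id": 42,
    "name": "Jane Doe",
    "address": "1 Example Street",
    "device": "iPhone",
    "time_zone": "Europe/London",
}

cleaned = de_identify(record, secret_key=b"example-key")
print(cleaned)
```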

We are now able to look at this data in more detail. I've split the tables into three distinct categories: hardware, software and other. Hardware is about your device itself and some of its settings. Software (the Network Collection table) is similar but is more about how the phone connects to you and the internet. The "other" table looks at more unique elements which are far more complex than a simple setting. Each section below is intended as an overview; for the full details you can read the tables themselves.

First, the hardware table. We can see that Facebook and TikTok are fairly equal when it comes to hardware settings. TikTok collects a small amount of extra data in the form of some audio settings that Facebook passes over, but it doesn't look at mouse movements. These apply more to computers than to mobile devices, and they aren't that big of a deal. The point of tracking mouse movements is purely bot detection: determining whether a user is a person dragging their mouse across the screen to different buttons, or a computer running a program that practically teleports the pointer to specific locations. The only data Facebook could possibly collect from this is how often you check your friends list, or whether you prefer to move your mouse while you scroll. Simply put, this is harmless and leaves both companies nearly equal in data collection for hardware.
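
To show the "teleporting" idea in practice, here is a minimal Python sketch. Neither policy explains how bot detection is actually implemented, so the `looks_automated` function, the sample format and the speed threshold are purely my own assumptions about one simple heuristic, not Facebook's method.

```python
import math


def looks_automated(samples, max_speed_px_per_ms=20.0):
    """Flag pointer traces that jump implausibly fast between samples.

    samples is a list of (time_ms, x, y) tuples in chronological order.
    The threshold is an arbitrary value chosen for this example.
    """
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        elapsed_ms = max(t1 - t0, 1)  # avoid dividing by zero
        speed = math.hypot(x1 - x0, y1 - y0) / elapsed_ms
        if speed > max_speed_px_per_ms:
            return True
    return False


# A person drags the pointer in small steps; a script jumps straight to a button.
human_trace = [(0, 0, 0), (16, 5, 3), (32, 12, 8), (48, 22, 15)]
script_trace = [(0, 0, 0), (16, 800, 600)]

print(looks_automated(human_trace), looks_automated(script_trace))
```

A real system would almost certainly look at far more than speed, but even this shows why the raw movements say very little about you as a person.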


Next up is the software table. Here we see Facebook pulling ahead, with further data collection revolving around your Wi-Fi network: specifically nearby networks, connection speeds, other devices on your network, and Bluetooth. The interesting ones are Bluetooth and nearby Wi-Fi networks, because they mean Facebook doesn't just look at your device but, in broad terms, at the ones around it. Tracking Bluetooth means it can send the names of nearby devices that you could potentially connect to and, on a mobile device, a list of nearby networks. This is slightly peculiar, as the data is extremely broad and could cover hundreds of nearby devices about which very little is known. I expect this is to give Facebook an idea of how you use Facebook: whether you pick your phone or your laptop when you have the option of both, and whether you browse more from home or from public areas. You can determine the location from things such as the name of your network, be it a basic Sky or Virgin Media router or a public Costa Wi-Fi network. Facebook also tracks devices on your network, but this has recently become more common in apps to allow for quicker setup, such as linking YouTube to your TV: instead of slowly typing your password with the remote, you can visit a link on your phone and tap in a code to link to your TV and easily share content.

We also need to look at GPS data. Both Facebook and TikTok can ask to use GPS data, but Facebook simply asks for permission, whereas TikTok states "In certain jurisdictions, we may collect GPS data". This means they may collect your GPS data, but you can't know when. A recent Reddit post by a user who reverse engineered the app stated that TikTok would collect your GPS data every 30 seconds, meaning it could trace your path from one location to another with reasonable accuracy. However, on most devices you can block an application's access to GPS, so it's not as big a concern as it could be. On iPhone you can usually do this by going to the settings for that app and individually selecting what it can access. On Android you should be able to go to app info, then permissions, and disable specific permissions for that application. Overall, the software table shows Facebook collecting more data than TikTok.
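
To give a feel for why a fix every 30 seconds is enough to trace a path, here is a minimal Python sketch. It uses the standard haversine formula for the distance between two coordinates; the `path_summary` function and the sample coordinates are made up for illustration, and the 30 second interval is the unverified figure from the Reddit post.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def path_summary(fixes, interval_s=30):
    """Return (total distance in metres, average speed in m/s) for a list of fixes."""
    legs = [haversine_m(*a, *b) for a, b in zip(fixes, fixes[1:])]
    total_m = sum(legs)
    elapsed_s = interval_s * len(legs)
    average_speed = total_m / elapsed_s if elapsed_s else 0.0
    return total_m, average_speed


# Three made-up fixes, each 0.001 degrees of latitude apart (roughly 111 m).
fixes = [(51.5000, -0.1200), (51.5010, -0.1200), (51.5020, -0.1200)]
total_m, average_speed = path_summary(fixes)
print(round(total_m), round(average_speed, 1))
```

From the average speed alone you could make a fair guess at whether someone is walking, cycling or driving, which is the sort of inference regular fixes make possible.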


Finally, there is the "other" table. There is far less to offer here, but I feel it's still important to look at. As stated previously, Facebook does collect mouse movements on computers, but this is purely for bot detection and cannot be abused to learn more about you. TikTok, however, stores information on your keystroke patterns and rhythms. This is basically recording how you touch your screen, similar to mouse movements on a computer. TikTok is one of the few social media sites that acknowledges it collects this data from users. Whereas mouse movements are more basic because the mouse separates you from your actions, touch sensors leave no such gap and potentially mean collecting more identifiable information about you. A post by Janus Kopfstein talked about just this, which you can find here. The post covered Leon Eckert, a researcher at New York University's Interactive Telecommunications Program, who said he was able to build a system that analysed keystrokes on a keyboard and, via machine learning, could determine a person's mood, such as whether they were happy, sad, or even clinically depressed. So potentially, TikTok could determine your mood just from how you touch your screen. This is a lot of information for a single company to have: not just your interests, but also your mood at any given time.
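
To give an idea of what "patterns and rhythms" might look like as data, here is a minimal Python sketch. TikTok's policy does not say what it measures, so the event format and the `rhythm_features` function are assumptions of mine. Dwell time (how long a key is held) and flight time (the gap between releasing one key and pressing the next) are the kind of measurements commonly used in keystroke analysis.

```python
from statistics import mean


def rhythm_features(events):
    """Summarise typing rhythm from (key, press_ms, release_ms) events in typing order."""
    dwell_times = [release - press for _, press, release in events]
    flight_times = [
        following[1] - current[2] for current, following in zip(events, events[1:])
    ]
    return {
        "mean_dwell_ms": mean(dwell_times) if dwell_times else 0.0,
        "mean_flight_ms": mean(flight_times) if flight_times else 0.0,
    }


# Made-up timings for someone typing "hey".
events = [("h", 0, 90), ("e", 150, 230), ("y", 300, 400)]

features = rhythm_features(events)
print(features)
```

Notice that the features never use which keys were pressed, only the timing, and that timing is exactly the part a mood-detection system like the one described above would feed on.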


Just looking at the tables, both companies are fairly equal in their data collection, except for TikTok's keystroke analysis. But while this is all official and taken straight from the privacy policies on their websites, it's important to look a little further. One element is that TikTok has multiple privacy policies depending on your country. It has a separate privacy policy for each of these locations:

- The United States

- European Economic Area, United Kingdom, Switzerland

- Everywhere else

- Select countries with small additional clauses

For a full breakdown of which additional clauses apply to which countries, you can view the full list here.


Looking past official sources, a Reddit user who goes by the name of "bangorlol" was able to reverse engineer the app and made these discoveries:

- It could see what apps you have on your phone, even ones that have been deleted

- It could determine whether a phone is rooted or jailbroken

- It set up a local proxy on your device with no authentication, which would allow it to pretend to be something else, e.g. a mail server that could send fake emails

- All data was previously sent via HTTP requests (very bad)

- The app shut down if it could not connect to the analytics server


To explain some of these further: the local proxy is an issue because it means TikTok could turn your phone into a fake server that sends out emails, misleads people and more, in ways too technical to discuss here. The point is that if TikTok could use this, there is no reason an anonymous hacker couldn't do the same. Data being sent via HTTP means that it's sent in plaintext. Most websites you visit will have a small lock next to the website's name, meaning they use HTTPS, also known as HTTP Secure. This means data is encrypted in transit and is much harder to read. So previously, all data would have been sent back unencrypted, and anyone able to intercept the traffic could see exactly what that data was, such as your username and password. Finally, there is the app shutting down. Nearly all websites use some form of analytics to determine how busy a service is and how people use it. TikTok decided that if you block the connection to this server then you cannot use TikTok, which is rather petty in my opinion. However, the legitimacy of this post can't be entirely verified: TikTok never confirmed any of these features, although major news outlets did run stories on the post. So, its reliability depends on what you believe. Personally, I think that at least some of this is or was true, bearing in mind that the post is over 6 months old at the time of writing.
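
To show what "sent in plaintext" means, here is a minimal Python sketch. It builds the bytes of a login request as they would travel over plain HTTP, without sending anything. The host, path and credentials are made up, and this is not a reconstruction of TikTok's actual traffic.

```python
from urllib.parse import urlencode, urlsplit


def build_plain_http_request(host, path, form):
    """Return the raw bytes of an unencrypted HTTP POST carrying the form fields."""
    body = urlencode(form)
    request = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )
    return request.encode("ascii")


def uses_tls(url):
    """Return True only for https:// URLs, i.e. traffic that is encrypted in transit."""
    return urlsplit(url).scheme.lower() == "https"


# Made-up login details for a made-up site.
wire_bytes = build_plain_http_request(
    "example.com", "/login", {"username": "alice", "password": "hunter2"}
)
print(wire_bytes.decode("ascii"))
```

Anyone sitting between you and the server, such as someone on the same public Wi-Fi, would see those bytes exactly as printed. With HTTPS the same request is wrapped in encryption before it leaves your device.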

Finally, looking at all this data, I believe that TikTok is far worse than Facebook in terms of data collection. This is due to it collecting more raw data and a larger variety of it, and in some instances data that it probably shouldn't be collecting. Its multiple privacy policies, which are likely adjusted from an original policy to stay in line with the law in specific countries, also don't suggest a trustworthy nature. So overall, the data suggests that TikTok is in fact far worse than Facebook in terms of data collection.

If you have any suggestions or improvements for the blog then please send them to [email protected].

29th October 2020
