Forked from Lodewijk Loos / tiktoktoad

Tiktoktoad

This project is a logger for the TikTok web version. It intercepts some of the communication between TikTok and your browser. This data (which videos were recommended to you, which usage data was sent) is stored in a database on your computer so that it can be analysed. By doing this we hope to gain some insight into the algorithms TikTok uses.

This project uses:

  • a web extension loaded as a 'Temporary Add-on' (not released in the Firefox Add-ons store). The web extension runs in development mode.
  • a local web service using Ruby/Sinatra.

how it works

The web extension intercepts traffic (mainly xmlhttprequests) between TikTok and the browser. For example, every click on a video or other interaction with TikTok results in a notification to the TikTok servers, letting them know in detail what happened. When you scroll down or click around, new videos are recommended to you. The web extension intercepts these events, the recommendations and their labels, and stores them in a structured database by sending them to the aforementioned web service that runs locally on your computer. This web service exists because the web extension itself cannot write directly to your hard drive. After gathering data by 'just browsing', analysis can be done. N.B.: this project is still very much a work in progress. However, it works.

usage

Every time the browser is started via the web-ext command line, a new session begins with all browser caches and cookies cleared, giving you a fresh start as a brand-new anonymous user each time. All events and recommendations are linked in the database to this session (which has a unique id) so that they can be kept apart.
As soon as you log in using a TikTok account the username from your account will be used instead of the generated session id.
The little toad icon at the top right of the toolbar lets you add tags to the current session, so you can keep annotations for yourself.
Note: the web extension can run without the web service. The visualisation will work, but nothing is stored.

visualisation

When the extension is started it opens a second browser tab showing labels and tags bound to the videos in the current session. This is just for illustration of the idea that it could be interesting to give direct visual feedback on your TikTok browsing.

Installation

Get Firefox

This extension works in Firefox (and possibly some other browsers). https://www.mozilla.org/en-US/firefox/new/

Get the code

$ git clone https://gitlab.waag.org/lodewijk/tiktoktoad.git

Install the web extension

  • Get node.js/npm via the installer (https://nodejs.org/en/) or your favourite package manager.
  • Install the web-ext command line tool
    # tested with node v16.13.2, npm 8.1.2
    $ sudo npm install --global web-ext

more info on web-ext: https://github.com/mozilla/web-ext

Install the Ruby web service

  • Note: this might just work with the default ruby/gem/sqlite3 versions on macOS. Tested with:

    • ruby 2.6.3p62
    • gem 3.0.3
    • sqlite3 3.32.3

    Otherwise get ruby via your favourite package manager or RVM https://rvm.io (rvm install ruby-2.6) and install sqlite3 using a package manager.

  • Install Ruby dependencies

$ gem install sequel
$ gem install sqlite3
$ gem install sinatra

Run from terminal

run the web service for logging to our database

# in a separate terminal tab or window
tiktoktoad/webservice$ ruby log_service.rb

run Firefox with our web extension

tiktoktoad/webext$ web-ext run --start-url https://www.tiktok.com/

Code

web extension

The web extension consists of the following scripts that interact with each other via the messaging system.

background script

The background script (background.js) intercepts xmlhttprequests between the browser and the TikTok servers.

Furthermore the background script takes care of the communication with the web service, which functions as a logger and stores the intercepted data in a structured database.
The background script also functions as a central hub for the other scripts. It feeds the visualisation script and it gets input from the menu script and the content script.
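
The hand-off to the web service could look roughly like the sketch below. It is not the project's actual code: the URL path, the `kind` values and the envelope field names are made up for illustration, and the port is simply Sinatra's default (4567). Network errors are swallowed on purpose, since the extension is meant to keep working when the web service is not running.

```javascript
// Hypothetical endpoint: 4567 is Sinatra's default port, "/log" is a made-up path.
const LOG_SERVICE_URL = "http://localhost:4567/log";

// Build the fetch() arguments for one intercepted item.
// `kind` ("event" or "recommend") and the envelope fields are illustrative names.
function buildLogRequest(kind, sessionId, payload) {
  return {
    url: LOG_SERVICE_URL,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ kind, session: sessionId, data: payload }),
    },
  };
}

// Send the item and ignore network failures, so that browsing and the
// visualisation continue when nothing is listening on the local port.
function sendToLogger(kind, sessionId, payload, fetchImpl = fetch) {
  const { url, init } = buildLogRequest(kind, sessionId, payload);
  return fetchImpl(url, init).catch(() => undefined);
}
```

Keeping the request construction in a separate pure function makes it easy to check the payload shape without a running service.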

content script

The content script is mainly used for getting the initial application state. The state lives encoded in the page source. The background script could also access this source. However, by doing this from the content script we can make sure the state was already processed by the TikTok web application and access it as a normal JS object. N.B. while developing this project we encountered two types of application state: __NEXT_DATA__ (apparently NextJS) and SIGI_STATE.
The information on the currently logged in user is also in the application state.
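
A minimal sketch of that lookup, assuming the page exposes the state as a global named after the two state types mentioned above. The function name and the returned shape are illustrative, not taken from the project.

```javascript
// Return the initial application state from a window-like object.
// Assumption: the state is reachable as a property called SIGI_STATE or
// __NEXT_DATA__; the text above only names these two state types.
function readAppState(pageWindow) {
  if (pageWindow.SIGI_STATE) {
    return { type: "SIGI_STATE", state: pageWindow.SIGI_STATE };
  }
  if (pageWindow.__NEXT_DATA__) {
    return { type: "__NEXT_DATA__", state: pageWindow.__NEXT_DATA__ };
  }
  return null; // no known state found (yet)
}
```

In a Firefox content script, globals defined by the page are normally reached through `window.wrappedJSObject`, so the call would presumably be `readAppState(window.wrappedJSObject)`.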

menu script

The menu script handles the form in the pop-up menu, which allows the user to add annotations to the browsing session.

visualisation

The visualisation is just a proof of concept. The script receives messages from the background script and shows all tags and labels encountered while browsing. A new column is started after a page refresh. Note that labels (diversification labels) are only available for the first videos (initial state) on the page.

web service

The web service receives JSON posts from the background script. It processes the events and recommendations and puts them in a structured relational database. This allows for using SQL to gain insight into the data. See the database schema in the appendix.

Results and output

example queries

Once you have collected some data, you can use sqlite to dive into it. Have a look at the database schema and try some of the following example queries.

tiktoktoad/webservice$ sqlite3 tiktok.db

# check for ENABLE_JSON1
sqlite> PRAGMA compile_options;
# get music urls from video json data
sqlite> select json_extract(video.data, '$.music.playUrl') from video;
# get video download url from video json data
sqlite> select json_extract(video.data, '$.video.downloadAddr') from video;
# get video by specific json data property
sqlite> select description from video where json_extract(video.data, '$.author.nickname') = 'Kika Kim';
# all videos with their label
sqlite> select label, video.description from label join video_label on label.id = video_label.label_id join video on video_label.video_id = video.id;
# find the most recommended videos in all sessions
sqlite> select video.description, video.id, count(video_id) as recommends from recommend join video on recommend.video_id = video.id group by video_id order by recommends DESC;
# get events and specific data
sqlite> select name, timestamp, json_extract(event.data, '$.params.page_name') from event;
sqlite> select name, timestamp, json_extract(event.data, '$.params.duration') from event;
# find videos with a certain label
sqlite> select video.id, video.description, json_extract(video.data, '$.video.downloadAddr'), label from label join video_label on video_label.label_id = label.id join video on video_label.video_id = video.id where label = 'Scripted Comedy';
# get the ten most recent recommendations with their video data
sqlite> select timestamp, video.data from recommend join video on recommend.video_id = video.id order by timestamp desc limit 10;
# check most used labels
sqlite> select label, count(label) as count from label join video_label on label.id = video_label.label_id join video on video_label.video_id = video.id group by label order by count DESC ;

debugging

The background script currently does a little bit too much logging to the browser console. You may want to adapt this. To see the logs of the background script:

  1. Go to (paste in the Firefox address bar):
about:debugging#/runtime/this-firefox
  2. Press Inspect

Appendix

db schema

                                                      ┌────────────┐  ┌────────────┐
                                                      │video_tag   │  │tag         │
                                                      │------------│  │------------│
                                                      │tag         │x─┤id          │
                                                   ┌─x│video       │  │tag         │
                                                   │  │            │  │            │
                                                   │  └────────────┘  └────────────┘

                                                   │  ┌────────────┐  ┌────────────┐
                                                   │  │video_label │  │label       │
                                                   │  │------------│  │------------│
                  ┌────────────┐                   │  │label       │x─┤id          │
                  │event       │                   ├─x│video       │  │label       │
                  │------------│    ┌────────────┐ │  │            │  │            │
                  │id          │    │video       │ │  └────────────┘  └────────────┘
                  │name        │    │------------│ │
                  │video       │x─┬─┤id          ├─┘  ┌────────────┐
                  │...         │  │ │title       │    │music       │
┌────────────┐    │timestamp   │  │ │description │    │------------│
│session     │ ┌─x│session     │  │ │music       │x───┤id          │
│------------│ │  └────────────┘  │ │...         │    │title       │
│id          ├─┤                  │ │author      │x─┐ │...         │
│video       │ │                  │ └────────────┘  │ │            │
│...         │ │  ┌────────────┐  │                 │ │            │
│timestamp   │ │  │recommend   │  │                 │ └────────────┘
│            │ │  │------------│  │                 │
│            │ │  │id          │  │                 │ ┌────────────┐
└────────────┘ │  │video       │x─┘                 │ │author      │
               │  │...         │                    │ │------------│
               │  │timestamp   │                    └─┤id          │
               ├─x│session     │                      │unique_name │
               │  │            │                      │nick        │
               │  └────────────┘                      │...         │
               │                                      │            │
               │  ┌────────────┐                      │            │
               │  │session_tag │                      └────────────┘
               │  │------------│
               │  │id          │
               │  │tag         │
               │  │timestamp   │
               └─x│session     │
                  │            │
                  └────────────┘

source: https://asciiflow.com/#/share/eJzNV91ugjAYfRXS682LXWzTZyExTJutUcBJMRpjsni9i114sQfwMXgan2SFtmC%2F%2FtjCsowQAuXznNPvF%2FcoS1KMJoiSBc0X6A4tkx1es4V9jLYxmowfx3cx2rG7h6dndkfxlrKHGEW9jsvp83L68Di%2FQmzjOOsr57ghc5xPafLKH5sLf2pNBsBX91eHhAdrQ%2BCB0i1bupzOZK5ssR9BA1XViI2LOrjfcxHDFhdFr7goa4MITn559B1i21vQoJ38n%2BoRgkT9LJMXvGxjx5%2Bk3bDYwWrxriBvB1T6byX5GewkoL7YOt7gjF4Dmmh4jR1NNebjRs0ZVYvrlyjSzQDUuwbZuuoOKQDu6K9rkXHWs00XBhPKVpNaVCoefnGCXfMzwPVW0tFopLqzvtS9ntAl7szE67QsyKyzZqD%2BCgQSJSkuaJKuAOEcF7M1WVGSZ1eEIN04YYGLojaTmrvEhm%2Fka6Db5VpOAdM8LJ86Ys29ke5cQWkJsR6zBjgp6Vu%2BbuMjdmSg5OAgtyIXdEDNaEAtIdx4j5GiK3QRwrySrWaNZ3ma4mweta3GCux5ckKjMPMIMRK6UUDXd8kOGL3%2Be2yFgM4qE80s5QjSUu86ElbNj2NkOATkzfErIdUMsEGait4MauwpRtjmRZmR9xJPmzHg0ql%2FgFoAMzJb3NIYNuQsRCAWFiKfw2eMh3UCC3EAkQxg87fLAdh%2F%2Fpsr3zByZQboI8diqP7xcRmq3c%2BRz3B6Wj4MDE3WaObbMtEBHX4A9kz2Lg%3D%3D)

typical event

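An event typically has an event name, a group_id (a reference to the video) and an author_id (a reference to the creator of the video). In the example below, group_id and author_id sit inside params, which is itself a JSON-encoded string.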

{
  "event": "play_time",
  "params": "{\"page_name\":\"homepage_hot\",\"enter_from\":\"homepage_hot\",\"is_landing_page\":1,\"userAgent\":\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:96.0) Gecko/20100101 Firefox/96.0\",\"region\":\"NL\",\"page_url\":\"https://www.tiktok.com/\",\"group_id\":\"7051840377233476870\",\"author_id\":\"6944934559298618373\",\"play_mode\":\"one_column\",\"autoplay_status\":1,\"duration\":3643,\"enter_method\":\"\",\"search_id\":\"\",\"focus_time\":3327,\"is_from_webapp\":\"0\",\"event_index\":1643807729616}",
  "local_time_ms": 1643807314056,
  "is_bav": 0,
  "ab_sdk_version": "",
  "session_id": "9f1cdce1-6de1-4c0f-9216-51d41ffc87ab"
}
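
Because `params` is a JSON string nested inside the JSON event, reading the video or author reference takes a second parse. The sketch below shows one way to do that. The function name and the returned field names are illustrative, and the sample is a shortened copy of the event above.

```javascript
// Decode one intercepted event. `params` arrives as a JSON string inside the
// JSON event, so it needs its own JSON.parse.
function parseEvent(raw) {
  const params =
    typeof raw.params === "string" ? JSON.parse(raw.params) : raw.params || {};
  return {
    name: raw.event,
    sessionId: raw.session_id,
    // Ids stay strings: they are larger than Number.MAX_SAFE_INTEGER.
    videoId: params.group_id ?? null,
    authorId: params.author_id ?? null,
    params,
  };
}

// Shortened version of the play_time event shown above.
const sample = {
  event: "play_time",
  params: JSON.stringify({
    page_name: "homepage_hot",
    group_id: "7051840377233476870",
    author_id: "6944934559298618373",
    duration: 3643,
  }),
  session_id: "9f1cdce1-6de1-4c0f-9216-51d41ffc87ab",
};

const parsed = parseEvent(sample);
console.log(parsed.name, parsed.videoId, parsed.params.duration);
```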

Other events:

  • video_request
  • interact_existed_video_start
  • interact_existed_video_end
  • change_cookie_setting
  • _be_active
  • play_time
  • video_pause
  • video_slide_up
  • video_play_finish
  • video_play_end
  • share_panel_show
  • switch_sound

single video item

A video contains a reference to its author. Music data, challenges and labels are included in the JSON structure.
Videos loaded in the initial state in the page source also contain "diversification labels" (a sort of category).

{
  "id": "7048695864558161158",
  "desc": "Besef dit was gewoon een jaar geleden 😂  #fyp #foryou",
  "createTime": "1641152396",
  "scheduleTime": 0,
  "video": {
    "id": "7048695864558161158",
    "height": 1024,
    "width": 576,
    "duration": 30,
    "ratio": "720p",
    "cover": "https://p16-sign-va.tiktokcdn.com/obj/tos-maliva-p-0068/218ee74c5ede481d887495a40ffb8f90_1641152398?x-expires=1643828400&x-signature=OSvgHp9EjBCmRtg4tD5wks1UHus%3D",
    "originCover": "https://p16-sign-va.tiktokcdn.com/obj/tos-maliva-p-0068/c9a1078161b8449e92ba53cb962b3c5c_1641152397?x-expires=1643828400&x-signature=jSw9sdS1g2Ak%2Fo448Rk9hRmCt5s%3D",
    "dynamicCover": "https://p16-sign-va.tiktokcdn.com/obj/tos-maliva-p-0068/675e1c4d84214fe7986a617a7a06f217_1641152397?x-expires=1643828400&x-signature=dKrajtemZfop7BafbCVdqDB0OAw%3D",
    "playAddr": "https://v16-webapp.tiktok.com/33afe9df5841517acb60ae47014332db/61fad6c8/video/tos/useast2a/tos-useast2a-ve-0068c004/3b60bc3c24ba4212b487ddb1aecace04/?a=1988&br=3046&bt=1523&cd=0%7C0%7C1%7C0&ch=0&cr=0&cs=0&cv=1&dr=0&ds=3&er=&ft=XOQ9-3tZnz7ThNjL.lXq&l=202202021308250102231230341A53D896&lr=tiktok_m&mime_type=video_mp4&net=0&pl=0&qs=0&rc=amhoNGg6ZnV5OjMzNzczM0ApOGk7NDw2Ojs7NzZkaTVmZ2dwNW9ocjRvcjVgLS1kMTZzc2JfYmA0M2BiYTY2My01MWA6Yw%3D%3D&vl=&vr=",
    "downloadAddr": "https://v16-webapp.tiktok.com/33afe9df5841517acb60ae47014332db/61fad6c8/video/tos/useast2a/tos-useast2a-ve-0068c004/3b60bc3c24ba4212b487ddb1aecace04/?a=1988&br=3046&bt=1523&cd=0%7C0%7C1%7C0&ch=0&cr=0&cs=0&cv=1&dr=0&ds=3&er=&ft=XOQ9-3tZnz7ThNjL.lXq&l=202202021308250102231230341A53D896&lr=tiktok_m&mime_type=video_mp4&net=0&pl=0&qs=0&rc=amhoNGg6ZnV5OjMzNzczM0ApOGk7NDw2Ojs7NzZkaTVmZ2dwNW9ocjRvcjVgLS1kMTZzc2JfYmA0M2BiYTY2My01MWA6Yw%3D%3D&vl=&vr=",
    "shareCover": [
      "",
      "https://p16-sign-va.tiktokcdn.com/tos-maliva-p-0068/c9a1078161b8449e92ba53cb962b3c5c_1641152397~tplv-tiktok-play.jpeg?x-expires=1644411600&x-signature=qc%2BEGGGx5Y%2FoIUKaZeqo7ymS4Gc%3D",
      "https://p16-sign-va.tiktokcdn.com/tos-maliva-p-0068/c9a1078161b8449e92ba53cb962b3c5c_1641152397~tplv-tiktokx-share-play.jpeg?x-expires=1644411600&x-signature=QaFUzRwde6CQ0%2BCDX4oFC4bJuCg%3D"
    ],
    "reflowCover": "https://p16-sign-va.tiktokcdn.com/obj/tos-maliva-p-0068/af9bd9723de38f0199e20f9aaa5807f3?x-expires=1643828400&x-signature=E8%2BuE8yxAtJS0ztdvdk%2BSZLVU58%3D",
    "bitrate": 1559564,
    "encodedType": "normal",
    "format": "mp4",
    "videoQuality": "normal",
    "encodeUserTag": "",
    "codecType": "h264",
    "definition": "720p"
  },
  "author": "iliasvietto17",
  "music": {
    "id": "7048695834539559685",
    "title": "origineel geluid",
    "playUrl": "https://sf77-ies-music-va.tiktokcdn.com/obj/musically-maliva-obj/7048695862213790469.mp3",
    "coverLarge": "https://p16-sign-va.tiktokcdn.com/tos-maliva-avt-0068/4caa1210488a1551c4804431171493ec~c5_1080x1080.jpeg?x-expires=1643893200&x-signature=yhgYrzGYcfKPRWTQ3AoWQZYQ9gc%3D",
    "coverMedium": "https://p16-sign-va.tiktokcdn.com/tos-maliva-avt-0068/4caa1210488a1551c4804431171493ec~c5_720x720.jpeg?x-expires=1643893200&x-signature=v%2B11cxHy0bpIBcUgBB9%2BrSQHwkk%3D",
    "coverThumb": "https://p16-sign-va.tiktokcdn.com/tos-maliva-avt-0068/4caa1210488a1551c4804431171493ec~c5_100x100.jpeg?x-expires=1643893200&x-signature=JfIIErCax2y83UXY0bvSI8DjMqA%3D",
    "authorName": "Viètto El Piètta",
    "original": true,
    "duration": 30,
    "album": "",
    "scheduleSearchTime": 0
  },
  "challenges": [
    {
      "id": "229207",
      "title": "fyp",
      "desc": "",
      "profileLarger": "",
      "profileMedium": "",
      "profileThumb": "",
      "coverLarger": "",
      "coverMedium": "",
      "coverThumb": "",
      "isCommerce": false
    },
    {
      "id": "42164",
      "title": "foryou",
      "desc": "",
      "profileLarger": "",
      "profileMedium": "",
      "profileThumb": "",
      "coverLarger": "",
      "coverMedium": "",
      "coverThumb": "",
      "isCommerce": false
    }
  ],
  "stats": {
    "diggCount": 83800,
    "shareCount": 1668,
    "commentCount": 2064,
    "playCount": 734900
  },
  "duetInfo": {
    "duetFromId": "0"
  },
  "warnInfo": [],
  "originalItem": false,
  "officalItem": false,
  "textExtra": [
    {
      "awemeId": "",
      "start": 42,
      "end": 46,
      "hashtagId": "229207",
      "hashtagName": "fyp",
      "type": 1,
      "subType": 0,
      "userId": "",
      "isCommerce": false,
      "userUniqueId": "",
      "secUid": ""
    },
    {
      "awemeId": "",
      "start": 47,
      "end": 54,
      "hashtagId": "42164",
      "hashtagName": "foryou",
      "type": 1,
      "subType": 0,
      "userId": "",
      "isCommerce": false,
      "userUniqueId": "",
      "secUid": ""
    }
  ],
  "secret": false,
  "forFriend": false,
  "digged": false,
  "itemCommentStatus": 0,
  "showNotPass": false,
  "vl1": false,
  "takeDown": 0,
  "itemMute": false,
  "effectStickers": [],
  "authorStats": {
    "followerCount": 601900,
    "followingCount": 194,
    "heart": 26800000,
    "heartCount": 26800000,
    "videoCount": 447,
    "diggCount": 18900
  },
  "privateItem": false,
  "duetEnabled": true,
  "stitchEnabled": true,
  "stickersOnItem": [],
  "isAd": false,
  "shareEnabled": true,
  "comments": [],
  "duetDisplay": 0,
  "stitchDisplay": 0,
  "indexEnabled": true,
  "diversificationLabels": [
    "Comedy",
    "Performance"
  ],
  "nickname": "Viètto El Piètta",
  "authorId": "6736668404306035718",
  "authorSecId": "MS4wLjABAAAAStO-uymL2x65GD8vrTc8cHnOkOWnhHqU1L9-BX_tuVTmugc6kk1j3Gghd-oA8YDB",
  "avatarThumb": "https://p16-sign-va.tiktokcdn.com/tos-maliva-avt-0068/4caa1210488a1551c4804431171493ec~c5_100x100.jpeg?x-expires=1643893200&x-signature=JfIIErCax2y83UXY0bvSI8DjMqA%3D",
  "batch": "1"
}
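
As an illustration of working with this structure, the sketch below pulls the hashtags out of `textExtra`. In the sample above, the `start` and `end` offsets line up with JavaScript string indices (UTF-16 code units, so the emoji counts as two). Whether that holds for every video is an assumption. The function name is illustrative.

```javascript
// Collect the hashtags of a video item from its textExtra entries.
function extractHashtags(video) {
  const desc = video.desc || "";
  return (video.textExtra || [])
    .filter((extra) => extra.hashtagName)
    .map((extra) => ({
      id: extra.hashtagId,
      name: extra.hashtagName,
      // Slice of the description that the offsets point at.
      text: desc.slice(extra.start, extra.end),
    }));
}

// Reduced copy of the video item above; \u{1F602} is the emoji in "desc".
const video = {
  desc: "Besef dit was gewoon een jaar geleden \u{1F602}  #fyp #foryou",
  textExtra: [
    { start: 42, end: 46, hashtagId: "229207", hashtagName: "fyp" },
    { start: 47, end: 54, hashtagId: "42164", hashtagName: "foryou" },
  ],
};

console.log(extractHashtags(video).map((tag) => tag.text));
```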

logged in user

The information on the currently logged in user (possibly you:)) is also in the initial application state object.

{
  "ftcUser": false,
  "secUid": "MS4wLjABAAAAC7jPWE5P0mAU4yBan6yaSr6bXpCJUf652omWJitiOrLdcZg1L-4hlQgE9J_cKjk_",
  "uid": "7048788831880922118",
  "nickName": "Rommel Zooi",
  "signature": "",
  "uniqueId": "rommelzooi",
  "createTime": "1641208395",
  "hasLivePermission": false,
  "roomId": "",
  "region": "",
  "avatarUri": ["https://p16-sign-va.tiktokcdn.com/musically-maliva-obj/70489…es=1641301200&x-signature=xtOG%2B33AKboit%2FCDKdbXPvbmHyA%3D"],
  "isPrivateAccount": false,
  "hasIMPermission": true,
  "showPrivateBanner": false,
  "showScheduleTips": false,
  "longVideoMinutes": 0,
  "ageGateRegion": "NL",
  "ageGateTime": "1641208394513",
  "userMode": 1,
  "proAccountInfo": {
    "status": 0,
    "analyticsOn": false,
    "businessSuiteEntrance": false,
    "downloadLink": {}
  },
  "analyticsOn": false,
  "redDot": []
}
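
As described under usage, the account's username replaces the generated session id once you log in. A minimal sketch of that rule, assuming `uniqueId` is the field that serves as the username (the project may use a different one); the function name is illustrative.

```javascript
// Choose the key under which events and recommends are stored.
// Assumption: `uniqueId` is the username field; it is not confirmed by the
// project's code.
function sessionKey(generatedSessionId, user) {
  const username = user && user.uniqueId;
  return username ? username : generatedSessionId;
}
```

With the user object above, `sessionKey("9f1cdce1-6de1-4c0f-9216-51d41ffc87ab", user)` would return `"rommelzooi"`; without a logged-in user it falls back to the generated id.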

References