Software tools for the Paris II courses



Collecting Facebook Data with Netvizz

Netvizz is a tool that extracts data from the Facebook platform.

File outputs can be easily analyzed in standard software.

Prerequisite:

You need a Facebook account.

If you do not have one, create an account at http://www.facebook.fr.

Step 1:

After logging in to Facebook, go to the search bar (at the top), type "Netvizz", and click on the result.

Figure 1

Step 2:

Select "page data" to collect Facebook Fan Page data.

Figure 2

Step 3:

Specify the number of posts or the date range you want.

Enter the page ID (to get the page ID, enter your Facebook Fan Page URL at https://lookup-id.com/).

Select the get method "posts by page and users".

Figure 3

Step 4:

Wait while the data is extracted. A large Fan Page can take a while to process (minutes or hours).

Be patient and do not reload the page!

When the extraction is complete, download the zip archive.

Figure 4

Step 5:

After decompressing the zip archive, you have:

  • A tabular file that lists different metrics for each post.
  • A tabular file that lists basic stats per day for the period covered by the selected posts.
  • A tabular file that contains the text of user comments (users anonymized).
  • A bipartite graph file in GDF format that shows posts, users (anonymized), and the connections between the two. A user is connected to a post if the user commented on or liked it.
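To make the bipartite structure concrete, here is a minimal Python sketch. It assumes the graph has already been reduced to (user, post, weight) triples, where the weight counts how many times the user commented on or liked the post. The edge values and the function name `summarize_posts` are made up for illustration and do not come from Netvizz.

```python
from collections import defaultdict

# Hypothetical edges: (user, post, weight), weight = number of comments or likes.
edges = [("user_a", "post_1", 2), ("user_b", "post_1", 1), ("user_a", "post_2", 1)]

def summarize_posts(edge_list):
    """Map each post to (number of distinct users, total interactions)."""
    users = defaultdict(set)
    interactions = defaultdict(int)
    for user, post, weight in edge_list:
        users[post].add(user)          # users only ever link to posts, never to users
        interactions[post] += weight   # sum the edge weights per post
    return {post: (len(users[post]), interactions[post]) for post in users}

summary = summarize_posts(edges)
print(summary)
```

Because every edge joins one user node to one post node, a per-post summary like this only needs a single pass over the edge list.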

Open the GDF file with Gephi.

The tab-separated files can be stored in a data warehouse, for example.

Figure 5
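Before loading the tabular files into a data warehouse, it can help to inspect them with a short script. The sketch below uses only the Python standard library and assumes the files are tab-separated with a header row. The sample content and its column names (`post_id`, `type`, `likes`) are invented for the example and may not match the real files.

```python
import csv
import io

def read_tab_file(handle):
    """Return the rows of a tab-separated file as a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))

# Made-up sample standing in for one of the tabular files.
sample = io.StringIO("post_id\ttype\tlikes\n1_10\tphoto\t12\n1_11\tstatus\t3\n")

rows = read_tab_file(sample)
total_likes = sum(int(row["likes"]) for row in rows)
print(len(rows), total_likes)
```

For a real file, replace the sample with something like `open("my_page.tab", newline="", encoding="utf-8")`, where the file name is a placeholder for whichever file the archive contains.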

File Column Explanations:

The GDF file contains the following columns:

type_post: Facebook's post classification (e.g. photo, status, etc.)

post_published: publishing date

post_published_unix: publishing date as Unix timestamp (for easy conversion and ranking)

user_locale: user selected interface language (empty if node is post)

user_sex: user specified sex (empty if node is post)

likes: number of actually retrieved likes a post received or a user made

likes_count_fb: Facebook provided like count for posts (can be higher than actually retrieved likes)

comments_all: number of comments made on a post or by a user

comments_base: number of base level comments (in threaded conversations)

comments_replies: number of reply level comments (in threaded conversations)

comments_count_fb: Facebook provided comment count for posts (can be higher than actually retrieved comments)

comment_likes: number of likes on comments (posts only)

shares: number of shares (posts only)

engagement: likes, comments_all, comment_likes, shares summed

post_id: id of the post (empty for users)

post_link: link of the post (empty for users)

edge weight encodes the number of times a user commented or liked a post
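The GDF file can also be read outside Gephi. The Python sketch below assumes the usual GDF layout: a `nodedef>` header line followed by comma-separated node rows, then an `edgedef>` header line followed by edge rows. The sample data is invented and uses only a subset of the columns listed above, so the header of a real export will differ. The sketch converts `post_published_unix` to a date and recomputes `engagement` as the sum described above.

```python
import csv
from datetime import datetime, timezone

# Invented sample; a real export has more columns.
SAMPLE_GDF = """nodedef>name VARCHAR,type_post VARCHAR,post_published_unix INT,likes INT,comments_all INT,comment_likes INT,shares INT,engagement INT
post_1,photo,1420070400,10,4,2,1,17
user_a,,,1,1,,,
edgedef>node1 VARCHAR,node2 VARCHAR,weight INT
user_a,post_1,2
"""

def parse_gdf(text):
    """Split GDF text into node dicts and edge dicts, keyed by column name."""
    nodes, edges, columns, target = [], [], [], None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith(("nodedef>", "edgedef>")):
            target = nodes if line.startswith("nodedef>") else edges
            header = line.split(">", 1)[1]
            # Each header entry looks like "name TYPE"; keep only the name.
            columns = [entry.strip().split(" ")[0] for entry in header.split(",")]
            continue
        if target is None:
            continue  # ignore anything before the first header line
        values = next(csv.reader([line], skipinitialspace=True))
        target.append(dict(zip(columns, values)))
    return nodes, edges

nodes, edges = parse_gdf(SAMPLE_GDF)
post = nodes[0]
# Unix timestamps count seconds since 1970-01-01 UTC.
published = datetime.fromtimestamp(int(post["post_published_unix"]), tz=timezone.utc)
# engagement = likes + comments_all + comment_likes + shares
engagement = sum(int(post[key]) for key in ("likes", "comments_all", "comment_likes", "shares"))
print(published.date(), engagement, edges[0]["weight"])
```

All values are returned as strings, and fields that do not apply to a node (for example `type_post` on a user) come back empty, so convert them only where a number is expected.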

Other Modules:

The following modules are currently available on Netvizz:

  • Group data - creates networks and tabular files for user activity around posts on groups
  • Page like network - creates a network of pages connected through the likes between them
  • Search - interface to Facebook's search function
  • Link stats - provides statistics for links shared on Facebook