Keyword Network and Word Frequency Analysis

An analysis in Python and R.

Project Summary

I collaborated with a partner to perform keyword network and word frequency analysis in Python and R. In R, we extracted keyword data from a provided file and converted it to a weighted adjacency matrix, which we then read in as a weighted network. We computed the degree and strength of each node and reported the top 10 nodes by degree, the top 10 nodes by strength, and the top 10 node pairs by edge weight. Finally, we plotted average strength against degree.
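The network steps can be illustrated with a minimal sketch. It is written in Python with only the standard library, so it is not the R code used in the project, and the `articles` input and all function names are hypothetical. It assumes edge weight means the number of keyword lists in which two keywords appear together, degree means the number of distinct neighbors, and strength means the sum of a node's edge weights.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical input: one list of keywords per article (not the project's data).
articles = [
    ["network", "analysis", "graph"],
    ["network", "graph"],
    ["analysis", "text"],
]


def build_weights(keyword_lists):
    """Count how often each unordered keyword pair co-occurs in a list."""
    weights = Counter()
    for keywords in keyword_lists:
        # Sorting gives every pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(keywords)), 2):
            weights[(a, b)] += 1
    return weights


def degree_and_strength(weights):
    """Degree = number of neighbors; strength = sum of incident edge weights."""
    degree, strength = Counter(), Counter()
    for (a, b), w in weights.items():
        degree[a] += 1
        degree[b] += 1
        strength[a] += w
        strength[b] += w
    return degree, strength


def average_strength_by_degree(degree, strength):
    """Mean strength of all nodes sharing the same degree."""
    grouped = defaultdict(list)
    for node, k in degree.items():
        grouped[k].append(strength[node])
    return {k: sum(v) / len(v) for k, v in sorted(grouped.items())}


weights = build_weights(articles)
degree, strength = degree_and_strength(weights)

print(degree.most_common(10))      # top nodes by degree
print(strength.most_common(10))    # top nodes by strength
print(weights.most_common(10))     # top node pairs by weight
print(average_strength_by_degree(degree, strength))
```

The dictionary returned by `average_strength_by_degree` holds the points for an average strength versus degree plot, with degree on the x-axis.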

In Python, using Google Colab, we analyzed Elon Musk's tweets from 2017 to 2022. For each year, we computed word frequencies, listed the 10 most frequent words, and plotted a histogram of the frequencies. To compare the data against Zipf’s law, we plotted word frequency against rank on log-log axes for each year. We also plotted a bigram network graph for each year.
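The per-year word statistics can be sketched as follows, using only the Python standard library. The `tweets` sample, the tokenizer, and the function names are hypothetical stand-ins rather than the project's notebook code, and the sketch assumes no stop-word removal, since the summary does not say whether any was applied.

```python
import math
import re
from collections import Counter

# Hypothetical sample: (year, text) pairs standing in for the real tweet data.
tweets = [
    (2021, "Rockets are hard. Rockets are fun."),
    (2021, "Testing rockets again"),
    (2022, "Cars and rockets"),
]


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def frequencies_by_year(rows):
    """Map each year to a Counter of word frequencies."""
    counts = {}
    for year, text in rows:
        counts.setdefault(year, Counter()).update(tokenize(text))
    return counts


def zipf_points(counter):
    """Return (log10 rank, log10 frequency) pairs, most frequent word first."""
    ranked = sorted(counter.values(), reverse=True)
    return [(math.log10(r), math.log10(f)) for r, f in enumerate(ranked, start=1)]


def bigram_counts(rows, year):
    """Count adjacent word pairs within each tweet of the given year.

    Punctuation is dropped by the tokenizer, so pairs can span sentences.
    """
    counts = Counter()
    for y, text in rows:
        if y == year:
            tokens = tokenize(text)
            counts.update(zip(tokens, tokens[1:]))
    return counts


freqs = frequencies_by_year(tweets)
top_2021 = freqs[2021].most_common(10)   # 10 most frequent words in 2021
points_2021 = zipf_points(freqs[2021])   # data for the log-log plot
edges_2021 = bigram_counts(tweets, 2021) # weighted edges for a bigram graph
print(top_2021)
```

Under Zipf’s law, frequency is roughly proportional to 1/rank, so the points from `zipf_points` should fall close to a straight line with a slope near -1 on a large enough corpus. The bigram counts can be passed to a graph library as weighted edges to draw the network.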

Project Links

Python Code