20210721174643_uhys2_1.gif

*The final animated bar chart

If you are a Python enthusiast who, like the author of this article, is curious about some R packages such as gganimate and wants to put them to use, this article is for you.

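The author used one of the most popular datasets on Kaggle to create the animated, sliding bar chart shown above. Follow the link below for details on the dataset.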

Spotify's Worldwide Daily Song Ranking | Kaggle

The author starts by using the read_csv function from good old pandas to load the collected CSV file into a DataFrame. The first five rows of the DataFrame are shown below:

import pandas as pd

# Read the Spotify dataset
df = pd.read_csv('spotify_dataset.csv')

# Show the first five rows
df.head()

20210728120828_tij1j_2.png

Before building the final DataFrame for the R-based GIF animation, the author did the following to prepare the data:

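  • Create a dictionary and use it to replace the abbreviated region codes with the real region names
  • Convert the Date column from the object type to datetime, and create extra columns (year, month, day, and day of the week) for ease of use
  • Create a new column named Title for the y-axis of the bar chart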

# Create a dictionary mapping for the real region values (https://www.spotify.com/us/select-your-country/)
region_dic = {'ar':'Argentina', 'at':'Austria', 'au':'Australia', 'be':'Belgium', 'bo':'Bolivia', 'br':'Brazil', 'ca':'Canada', 'ch':'Switzerland', 'cl':'Chile', 'co':'Colombia', 'cr':'CostaRica', 'cz':'CzechRepublic', 'de':'Germany', 'dk':'Denmark', 'do':'DominicanRepublic', 'ec':'Ecuador', 'ee':'Estonia', 'es':'Spain', 'fi':'Finland', 'fr':'France', 'gb':'UnitedKingdom', 'global':'Global', 'gr':'Greece', 'gt':'Guatemala', 'hk':'HongKong', 'hn':'Honduras', 'hu':'Hungary', 'id':'Indonesia', 'ie':'Ireland', 'is':'Iceland', 'it':'Italy', 'jp':'Japan', 'lt':'Lithuania', 'lu':'Luxembourg', 'lv':'Latvia', 'mx':'Mexico', 'my':'Malaysia', 'nl':'Netherlands', 'no':'Norway', 'nz':'NewZealand', 'pa':'Panama', 'pe':'Peru', 'ph':'Philippines', 'pl':'Poland', 'pt':'Portugal', 'py':'Paraguay', 'se':'Sweden', 'sg':'Singapore', 'sk':'Slovakia', 'sv':'ElSalvador', 'tr':'Turkey', 'tw':'Taiwan', 'us':'USA', 'uy':'Uruguay'}

# Replace with the true Region names
df = df.replace({"Region":region_dic})

# Replace the Date type for ease of use and creating extra columns
df.Date = pd.to_datetime(df["Date"])

# Create year, month, day and day of the week columns
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month
df['Day'] = df['Date'].dt.day
df['Day_of_week'] = df['Date'].dt.dayofweek

# Combine Track name and Artist for ease of use
df['Title'] = df['Artist'] +' - '+ df['Track Name']

df.head()

20210728120842_fljdj_3.png
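
The preparation steps above can be tried on a few made-up rows before running them on the full dataset. The sketch below is only an illustration: the sample rows, the two-entry `region_names` mapping, and the `prepare` helper are assumptions introduced here, not part of the author's script.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from spotify_dataset.csv.
sample = pd.DataFrame({
    "Track Name": ["Song A", "Song B", "Song A"],
    "Artist": ["Artist X", "Artist Y", "Artist X"],
    "Streams": [100, 250, 300],
    "Date": ["2017-01-01", "2017-01-02", "2017-02-06"],
    "Region": ["us", "gb", "us"],
})

# Abbreviated region code -> readable name (a small subset of the full mapping).
region_names = {"us": "USA", "gb": "UnitedKingdom"}


def prepare(df: pd.DataFrame, mapping: dict) -> pd.DataFrame:
    """Return a copy with readable regions, date parts, and a Title column."""
    out = df.copy()
    out["Region"] = out["Region"].replace(mapping)
    out["Date"] = pd.to_datetime(out["Date"])  # object -> datetime64
    out["Year"] = out["Date"].dt.year
    out["Month"] = out["Date"].dt.month
    out["Day"] = out["Date"].dt.day
    out["Day_of_week"] = out["Date"].dt.dayofweek  # Monday is 0, Sunday is 6
    out["Title"] = out["Artist"] + " - " + out["Track Name"]
    return out


prepared = prepare(sample, region_names)
print(prepared[["Region", "Year", "Month", "Day_of_week", "Title"]])
```

Working on a copy keeps the raw frame intact, which makes it easier to rerun the preparation while experimenting.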

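Finally, the author created a separate DataFrame for each month of a given region (the USA in this case), then concatenated them all into a single DataFrame, as shown below.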

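The combined frame has four columns: index (the rank, from 1 to 10), Title (y-axis), Streams (x-axis), and YearMonth (the grouping key). Because each month's Streams value is the cumulative sum from the start of the year up to that month, the bars grow along the timeline. The author then wrote the final frame to a CSV file so it could be animated in RStudio.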

df_usa_1701 = df[(df.Region == "USA") & (df.Year == 2017) & (df.Month.isin([1]))].groupby(["Title"],as_index=False).agg({"Streams": "sum"}).sort_values(["Streams"], ascending=[False]).head(10).reset_index(drop=True)
df_usa_1701.index = df_usa_1701.index + 1
df_usa_1701["YearMonth"] = 201701
df_usa_1701 = df_usa_1701.reset_index()

df_usa_1702 = df[(df.Region == "USA") & (df.Year == 2017) & (df.Month.isin([1,2]))].groupby(["Title"],as_index=False).agg({"Streams": "sum"}).sort_values(["Streams"], ascending=[False]).head(10).reset_index(drop=True)
df_usa_1702.index = df_usa_1702.index + 1
df_usa_1702["YearMonth"] = 201702
df_usa_1702 = df_usa_1702.reset_index()

# .
# .  (the frames for the months in between are omitted here)
# .

df_usa_1801 = df[(df.Region == "USA") & (df.Year.isin([2017,2018])) & (df.Month.isin([1,2,3,4,5,6,7,8,9,10,11,12]))].groupby(["Title"],as_index=False).agg({"Streams": "sum"}).sort_values(["Streams"], ascending=[False]).head(10).reset_index(drop=True)
df_usa_1801.index = df_usa_1801.index + 1
df_usa_1801["YearMonth"] = 201801
df_usa_1801 = df_usa_1801.reset_index()

frames = [df_usa_1701, df_usa_1702, df_usa_1703, df_usa_1704, df_usa_1705, df_usa_1706, df_usa_1707, df_usa_1708, df_usa_1709, df_usa_1710, df_usa_1711, df_usa_1712, df_usa_1801]
df_usa_merged = pd.concat(frames)

df_usa_merged.to_csv('df_usa_merged.csv')

20210728120852_943jd_4.png
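
The per-month blocks above all repeat one pattern, so they could also be written as a loop. The following is a minimal sketch of that idea, not the author's code: the `cumulative_top_n` helper, its parameters, and the demo rows are hypothetical, and it assumes the data contains nothing earlier than the first period, so that "everything up to this month" matches the author's filters.

```python
import pandas as pd


def cumulative_top_n(df, region, periods, n=10):
    """Build one top-n frame per (year, month) period from cumulative streams.

    `periods` is a chronologically ordered list of (year, month) tuples.
    """
    regional = df[df["Region"] == region].copy()
    # Integer key such as 201701, so periods compare chronologically.
    regional["YearMonth"] = regional["Year"] * 100 + regional["Month"]
    frames = []
    for year, month in periods:
        key = year * 100 + month
        top = (
            regional[regional["YearMonth"] <= key]  # all rows up to this month
            .groupby("Title", as_index=False)
            .agg({"Streams": "sum"})
            .sort_values("Streams", ascending=False)
            .head(n)
            .reset_index(drop=True)
        )
        top.index = top.index + 1          # rank 1..n
        top["YearMonth"] = key
        frames.append(top.reset_index())   # the rank becomes the "index" column
    return pd.concat(frames, ignore_index=True)


# Made-up rows for illustration only.
demo = pd.DataFrame({
    "Region": ["USA", "USA", "USA", "USA", "Global"],
    "Year": [2017, 2017, 2017, 2017, 2017],
    "Month": [1, 1, 2, 2, 1],
    "Title": ["X - A", "Y - B", "X - A", "Y - B", "X - A"],
    "Streams": [100, 50, 10, 200, 999],
})
merged = cumulative_top_n(demo, "USA", [(2017, 1), (2017, 2)], n=2)
print(merged)
```

With the real data, `periods` would presumably run from `(2017, 1)` through `(2018, 1)`, and the result could be written out with `to_csv` in the same way as `df_usa_merged`.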

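In the next steps, we need packages such as ggplot2, gganimate, and readr to build the animation and to read the CSV file created earlier. The ggplot function sets up all of the aesthetics, such as the index, the group, and the colors. The remaining color, theme, and size settings are a matter of preference.

The next step is to define the anim variable, choosing the transition length and the state length. Before calling the animate function we need a GIF renderer; for this the author uses gifski_renderer, a gganimate function that relies on the gifski package. Finally, the animate function is configured with the total number of frames, the frames per second, and the width and height.

The code is as follows: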

install.packages("ggplot2")
install.packages("gganimate")
install.packages("readr")

library(readr)
library(ggplot2)
library(gganimate)

df_r_usa <- read_csv("./df_usa_merged.csv")


staticplot = ggplot(df_r_usa, aes(index, group = Title, 
                                   fill = as.factor(Title), color = as.factor(Title))) +
  geom_tile(aes(y = Streams/2,
                height = Streams,
                width = 0.9), alpha = 0.8, color = NA) +
  geom_text(aes(y = 0, label = paste(Title, " ")), vjust = 0.2, hjust = 1) +
  geom_text(aes(y=Streams,label = Streams, hjust=0)) +
  coord_flip(clip = "off", expand = FALSE) +
  scale_y_continuous(labels = scales::comma) +
  scale_x_reverse() +
  guides(color = "none", fill = "none") +
  theme(axis.line=element_blank(),
        axis.text.x=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks=element_blank(),
        axis.title.x=element_blank(),
        axis.title.y=element_blank(),
        legend.position="none",
        panel.background=element_rect(colour = "lightcyan1"),
        panel.border=element_blank(),
        panel.grid.major=element_blank(),
        panel.grid.minor=element_blank(),
        panel.grid.major.x = element_line( size=.1, color="grey" ),
        panel.grid.minor.x = element_line( size=.1, color="grey" ),
        plot.title=element_text(size=25, hjust=0.5, face="bold", colour="grey", vjust=-1),
        plot.subtitle=element_text(size=18, hjust=0.5, face="italic", color="grey"),
        plot.caption =element_text(size=12, hjust=0.5, face="italic", color="grey"),
        plot.background=element_blank(),
        plot.margin = margin(2 ,8, 2, 4, "cm"))



anim = staticplot + transition_states(YearMonth, transition_length = 4, state_length = 1) +
  view_follow(fixed_x = TRUE)  +
  labs(title = 'Year_Month : {closest_state}',  
       subtitle  =  "Top 10 Titles (USA)",
       caption  = "Top 10 USA Titles by their streamings")


# gifski_renderer() is provided by gganimate; it needs the gifski package
install.packages("gifski")

animate(anim, 400, fps = 20,  width = 1200, height = 800, 
        renderer = gifski_renderer("spotify_usa.gif"))
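
To get a feel for what these numbers mean, the arithmetic can be worked through in a few lines of Python. This is a rough sketch under stated assumptions: it treats transition_length and state_length as relative weights and assumes the animation wraps from the last state back to the first (the documented default of transition_states). The `animation_timing` helper is hypothetical, and gganimate's actual frame allocation may differ slightly because of rounding.

```python
def animation_timing(nframes, fps, n_states, transition_length, state_length, wrap=True):
    """Estimate GIF duration and the share of frames per state and per transition."""
    duration_s = nframes / fps
    # With wrap-around there is one transition per state; otherwise one fewer.
    n_transitions = n_states if wrap else n_states - 1
    total_units = n_states * state_length + n_transitions * transition_length
    frames_per_unit = nframes / total_units
    return {
        "duration_s": duration_s,
        "frames_per_state": state_length * frames_per_unit,
        "frames_per_transition": transition_length * frames_per_unit,
    }


# 13 YearMonth states (201701 through 201801), 400 frames at 20 fps.
timing = animation_timing(400, 20, n_states=13, transition_length=4, state_length=1)
print(timing)
```

Under these assumptions the GIF runs for 20 seconds, with roughly 25 frames spent moving between two months and about 6 frames pausing on each month.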

*anim_bar_plot.R, hosted on GitHub

The complete Python code can be found at the link below.

[animated_bar_plot/spotify animated bar plot.py at master · korkmazarda/animated_bar_plot · GitHub](https://github.com/korkmazarda/animated_bar_plot/blob/master/spotify animated bar plot.py)

