I am using Python 3 on Colaboratory to collect Twitter hashtags, and I also collect the combinations of hashtags that appear together in each tweet to measure how much they overlap.
Initially I was collecting with Tweepy, but the Twitter API cannot retrieve tweets more than a week old, so I decided to collect tweets with a package called GetOldTweets.
However, when I changed the program written with Tweepy into one that uses GetOldTweets, the following error occurred.
Addendum: After applying the answer, another error occurred. I have updated the source code below accordingly.

Error message

AttributeError                            Traceback (most recent call last)
<ipython-input-35-d2284eea22e6> in <module>()
     15 for v in tweet:
     16     print(v.text)
---> 17     for tag0, tag1 in itertools.combinations(v.entities['hashtags'], 2):
     18         tag0 = tag0['text']
     19         tag1 = tag1['text']

AttributeError: 'Tweet' object has no attribute 'entities'

Applicable source code

!git clone https://github.com/Jefferson-Henrique/GetOldTweets-python
!pip install lxml pyquery

import os
os.chdir('GetOldTweets-python')

import got3 as got
import json
import itertools
import networkx as nx

G = nx.Graph()
tweetCriteria = got.manager.TweetCriteria().setQuerySearch('#heatstroke').setSince("2018-07-10").setUntil("2018-08-30").setMaxTweets(10000)
tweet = got.manager.TweetManager.getTweets(tweetCriteria)
print(tweet)
for v in tweet:
    print(v.text)
    for tag0, tag1 in itertools.combinations(v.entities['hashtags'], 2):
        tag0 = tag0['text']
        tag1 = tag1['text']
        if G.has_edge(tag0, tag1):
            G[tag0][tag1]["weight"] += 1
        else:
            G.add_edge(tag0, tag1, weight=1)
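The traceback says that the object returned by GetOldTweets has no `entities` attribute, which is the layout of a Tweepy status. One quick way to see what an unfamiliar object actually exposes is to list its attributes before indexing into it. The sketch below uses a stand-in class (`FakeTweet`, hypothetical, not the real GetOldTweets class) so that it runs without network access; with real data you would pass one of the objects returned by `getTweets` instead.

```python
class FakeTweet:
    """Hypothetical stand-in for a tweet object; not the real GetOldTweets class."""

    def __init__(self, text, hashtags):
        self.text = text
        self.hashtags = hashtags


def public_attributes(obj):
    """Return the sorted names of the public instance attributes set on obj."""
    return sorted(name for name in vars(obj) if not name.startswith("_"))


tweet = FakeTweet("It is hot today #heatstroke #hot", "#heatstroke #hot")

# Check for the attribute first instead of assuming the Tweepy layout.
print(hasattr(tweet, "entities"))
print(public_attributes(tweet))
```

If `entities` is missing, the attribute list shows which fields are available to build the hashtag pairs from.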
I took the program I had written with Tweepy, replaced the Tweepy parts with GetOldTweets, and ran it.
However, I could not find any material explaining how to write a program with GetOldTweets, so I have no idea what to try next.
import itertools
import json

import networkx as nx
import tweepy
from tweepy.streaming import StreamListener

G = nx.Graph()


class MyStreamListener(StreamListener):
    def __init__(self, api, **kw):
        self.api = api
        super(tweepy.StreamListener, self).__init__()
        self.twcnt = 0  # number of tweets received so far

    def on_status(self, tweet):
        self.twcnt += 1
        # Add one edge (or increase its weight) for every pair of hashtags in the tweet.
        for tag0, tag1 in itertools.combinations(tweet.entities['hashtags'], 2):
            tag0 = tag0['text']
            tag1 = tag1['text']
            if G.has_edge(tag0, tag1):
                G[tag0][tag1]["weight"] += 1
            else:
                G.add_edge(tag0, tag1, weight=1)
        # Returning False stops the stream after 10000 tweets.
        if self.twcnt > 10000:
            return False

    def on_error(self, status):
        return True


# consumer_key, consumer_secret, access_token and access_token_secret are not shown here.
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
stream = tweepy.Stream(auth, MyStreamListener(api))
stream.filter(track=['Follow #RT people'])
The following image is a part of the result when collecting in real time using the above program.
I would like to use GetOldTweets to extract the same kind of data as in the image above.
Answer #1
I have never used GetOldTweets, or even Tweepy.
For now, the current error can be solved by importing itertools.
Since I had already started looking into it, I tried it out locally.
You can get the hashtags as follows.

Local environment
Microsoft Windows [Version 10.0.17134.228]
Python 3.6.6 :: Anaconda, Inc.

Modified GetOldTweets
I don't know whether the Twitter specification has changed, but as things stand the hashtags cannot be retrieved correctly.
Replace the code on line 39 of got3/manager/TweetManager.py as follows:
# txt = re.sub(r"\s+", " ", tweetPQ("p.js-tweet-text").text().replace('# ', '#').replace('@ ', '@'))
txt = tweetPQ("p.js-tweet-text").text()
txt = re.sub(r"#\s?", '#', txt)
txt = re.sub(r"@\s?", '@', txt)
txt = re.sub(r"\s+", ' ', txt)
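To see what these substitutions do without patching the library, the same three `re.sub` calls can be applied to an ordinary string. This is only a sketch: the function name `normalize_tweet_text` and the sample text are made up for illustration and are not part of GetOldTweets.

```python
import re


def normalize_tweet_text(raw):
    """Apply the three substitutions from the replacement code to a plain string."""
    txt = re.sub(r"#\s?", "#", raw)  # "# tag" -> "#tag"
    txt = re.sub(r"@\s?", "@", txt)  # "@ user" -> "@user"
    txt = re.sub(r"\s+", " ", txt)   # collapse runs of whitespace into one space
    return txt


# Made-up sample with a detached "#" and "@", as the scraped text may contain.
sample = "Stay  hydrated # heatstroke @ someone"
print(normalize_tweet_text(sample))
```

Once the "#" is joined back to the word after it, a pattern that looks for "#" followed by word characters can match the whole tag.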
import got3 as got

tweet_criteria = got.manager.TweetCriteria() \
    .setQuerySearch('#heatstroke') \
    .setSince("2018-07-10") \
    .setUntil("2018-08-30") \
    .setMaxTweets(10)
tweets = got.manager.TweetManager.getTweets(tweet_criteria)
for tweet in tweets:
    hash_tags = [
        tag.lstrip('#')
        for tag in tweet.hashtags.split()
        if tag != '#'
    ]
    print(hash_tags)
After executing the above code, the following results were obtained.
['heatstroke', 'heatstroke measures', 'hot', 'blog update http']
['care', 'heatstroke']
['Kanto', 'weather', 'heatstroke']
['heatstroke']
['heatstroke']
['vertigo', 'vomiting', 'heatstroke', 'convulsions', 'muscle pain']
['heatstroke']
['Gyudon', 'Yoshinoya', 'Frozen', 'Free Shipping', 'Rakuten', 'Summer', 'Hot', 'Heatstroke']
['heatstroke attention', 'heatstroke prevention', 'heatstroke measures', 'severe heat', 'heatstroke']
['drug', 'pharmacist', 'heatstroke', 'dispensing pharmacy', 'dehydration', 'oral rehydration solution', 'dehydration symptoms']
If you take the pairwise combinations of the tags in each of these lists, it should work.
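As a rough sketch of that last step, the snippet below splits a space-separated hashtag string into a list and counts how often each pair of tags appears together. It uses only the standard library; the helper names `parse_hashtags` and `count_pairs` are hypothetical, and the sample lists are copied from the output above. Unlike the code in the question, it removes duplicate tags within a tweet and sorts each pair, so that (a, b) and (b, a) are counted as the same pair. That is a design choice for this sketch, not something the original program does.

```python
from collections import Counter
from itertools import combinations


def parse_hashtags(hashtag_string):
    """Split a space-separated hashtag string and drop the leading '#'."""
    return [tag.lstrip("#") for tag in hashtag_string.split() if tag != "#"]


def count_pairs(tag_lists):
    """Count how many tweets contain each unordered pair of tags."""
    counts = Counter()
    for tags in tag_lists:
        # Deduplicate and sort so each pair has a single canonical form.
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return counts


# Three of the lists printed above, used as sample input.
tag_lists = [
    ["heatstroke", "heatstroke measures", "hot", "blog update http"],
    ["care", "heatstroke"],
    ["heatstroke attention", "heatstroke prevention", "heatstroke measures",
     "severe heat", "heatstroke"],
]

pair_counts = count_pairs(tag_lists)
for (tag0, tag1), weight in pair_counts.most_common(3):
    print(tag0, tag1, weight)
```

Each counted pair can then be added to the networkx graph from the question, for example with `G.add_edge(tag0, tag1, weight=weight)`.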
Incidentally, GetOldTweets does not seem to be well maintained for Python 3.x.
(Judging from the error, I doubt it works correctly even on Python 2.x.)
"This package assumes using Python 2.x. The Python3 "got3" folder is maintained as experimental and is not officially supported."
Source: GetOldTweets-python/README.md (emphasis added by the quoter)
I am sorry that I cannot suggest a specific alternative, but I would recommend looking for one.