Category:CypherTech: Difference between revisions
Jump to navigation
Jump to search
Line 2: | Line 2: | ||
= Data Processing Process = | = Data Processing Process = | ||
# $ python get_link_lists.py # daily | # $ python get_link_lists.py # daily | ||
## Needs data archival / backup. | |||
# $ python parquet_link_list.py ../data/reddit/link_list/day/science/ ../data/reddit/parquet/link_list/day/science/ | # $ python parquet_link_list.py ../data/reddit/link_list/day/science/ ../data/reddit/parquet/link_list/day/science/ | ||
## Make this iterate and do a full refresh each time | |||
## Reconsider full refresh when processing time goes over a minute | |||
# $ python dedupe_link_list_parquet.py | # $ python dedupe_link_list_parquet.py | ||
## Make this iterate and do a full refresh each time | |||
## Reconsider full refresh when processing time goes over a minute | |||
# $ python get_discussions.py | # $ python get_discussions.py | ||
## Make this check the existing download, harvest timestamp, num_comments | |||
# $ python discussion_to_gpt.py | # $ python discussion_to_gpt.py | ||
# hit ChatGPT with output | |||
## Change this to API call to ChatGPT | |||
## Make this check the existing generate, harvest versus generate timestamp | |||
= Interesting Subreddits = | = Interesting Subreddits = |
Revision as of 11:57, 25 August 2023
Data Processing Process
- $ python get_link_lists.py # daily
- Needs data archival / backup.
- $ python parquet_link_list.py ../data/reddit/link_list/day/science/ ../data/reddit/parquet/link_list/day/science/
- Make this iterate and do a full refresh each time
- Reconsider full refresh when processing time goes over a minute
- $ python dedupe_link_list_parquet.py
- Make this iterate and do a full refresh each time
- Reconsider full refresh when processing time goes over a minute
- $ python get_discussions.py
- Make this check the existing download, harvest timestamp, num_comments
- $ python discussion_to_gpt.py
- hit ChatGPT with output
- Change this to API call to ChatGPT
- Make this check the existing generate, harvest versus generate timestamp
Interesting Subreddits
- aitah
- antiwork
- ask
- askmen
- askreddit
- askscience
- chatgpt
- conservative
- dataisbeautiful
- explainlikeimfive
- latestagecapitalism
- leopardsatemyface
- lifeprotips
- news
- nostupidquestions
- outoftheloop
- personalfinance
- politics
- programmerhumor
- science
- technology
- todayilearned
- tooafraidtoask
- twoxchromosomes
- unpopularopinion
- worldnews
- youshouldknow
Reddit OAuth2
- https://www.reddit.com/r/redditdev/wiki/oauth2/explanation/
- https://www.reddit.com/dev/api/oauth/
- https://github.com/reddit-archive/reddit/wiki/OAuth2
Example Curl Request
curl
-X POST
-d 'grant_type=password&username=reddit_bot&password=snoo'
--user 'p-jcoLKBynTLew:gko_LXELoV07ZBNUXrvWZfzE3aI'
https://www.reddit.com/api/v1/access_token
Real Curl Request
curl
-X POST
-d 'grant_type=client_credentials'
--user 'client_id:client_secret'
https://www.reddit.com/api/v1/access_token
One Line
curl -X POST -d 'grant_type=client_credentials' --user 'client_id:client_secret' https://www.reddit.com/api/v1/access_token
Oauth Data Call
$ curl -H "Authorization: bearer J1qK1c18UUGJFAzz9xnH56584l4" -A "Traxelbot/0.1 by rbb36" https://oauth.reddit.com/api/v1/me
$ curl -H "Authorization: bearer J1qK1c18UUGJFAzz9xnH56584l4" -A "Traxelbot/0.1 by rbb36" https://oauth.reddit.com/r/news/top?t=day&limit=100
- https://old.reddit.com/r/worldnews/top/?sort=top&t=day
- /r/subreddit/top?t=day&limit=100
- count=100&
Reddit Python
- pip install aiofiles aiohttp asyncio
- https://realpython.com/async-io-python/
t3 fields of interest
- "url_overridden_by_dest": "https://www.nbcnews.com/politics/donald-trump/live-blog/trump-georgia-indictment-rcna98900",
- "url": "https://www.nbcnews.com/politics/donald-trump/live-blog/trump-georgia-indictment-rcna98900",
- "title": "What infamous movie plot hole has an explanation that you're tired of explaining?",
- "downs": 0,
- "upvote_ratio": 0.94,
- "ups": 10891,
- "score": 10891,
- "created": 1692286512.0,
- "num_comments": 8112,
- "created_utc": 1692286512.0,
Minimal Term Set
hands, mouth, eyes, head, ears, nose, face, legs, teeth, fingers, breasts, skin, bones, blood, be born, children, men, women, mother, father, wife, husband, long, round, flat, hard, soft, sharp, smooth, heavy, sweet, stone, wood, made of, be on something, at the top, at the bottom, in front, around, sky, ground, sun, during the day, at night, water, fire, rain, wind, day, creature, tree, grow (in ground), egg, tail, wings, feathers, bird, fish, dog, we, know (someone), be called, hold, sit, lie, stand, sleep, play, laugh, sing, make, kill, eat, drink, river, mountain, jungle/forest, desert, sea, island, rain, wind, snow, ice, air, flood, storm, drought, earthquake, east, west, north, south, bird, fish, tree, dog, cat, horse, sheep, goat, cow, pig (camel, buffalo, caribou, seal, etc.), mosquitoes, snake, flies, family, we, year, month, week, clock, hour, house, village, city, school, hospital, doctor, nurse, teacher, soldier, country, government, the law, vote, border, flag, passport, meat, rice, wheat, corn (yams, plantain, etc.), flour, salt, sugar, sweet, knife, key, gun, bomb, medicines, paper, iron, metal, glass, leather, wool, cloth, thread, gold, rubber, plastic, oil, coal, petrol, car, bicycle, plane, boat, train, road, wheel, wire, engine, pipe, telephone, television, phone, computer, read, write, book, photo, newspaper, film, money, God, war, poison, music, go/went, burn, fight, buy/pay, learn, clean
Pages in category "CypherTech"
The following 7 pages are in this category, out of 7 total.