Bennie Pie's List

🙋JP input required

  • ⭐Philosophy Videos - Need a yes or no
    Would JP like all Philosophy videos transcribing/summarising/publishing?
  • ⭐Video File Backup - Need a yes or no
    I currently back up data (transcripts, Python files, etc.) to Google Drive. I do not back up ATP Geo video files as I don't download them. Would JP like me to back up video files?
  • Permanent Web Address - I'm going to change the address to https://atpgeo.youtub.erg.uy unless JP has any other preferences
  • ⭐Any preferences on design/wording/content that JP would like changing?

📂Ghost Article Data Load

  • ✅Error handling/restart
  • ✅Python Scheduler
  • ✅Create Post Backlog (1872 posts)
  • ✅Update Posts - working, just needs setting going
  • 🔄Write Newsletter/Configure Email Alert
  • ✅CSS Fixes

☁️Server/Network issues

  • ✅Fix DNS config issue
  • ✅Fix http to https redirect
  • ✅Check that the OpenSearch cluster can still receive/reply to search queries, all on port 443 and hosts are not exposed
  • ✅Fix OpenSearch Docker Compose to restart on boot
  • ✅Increase Java memory allocation in Docker Compose for OpenSearch nodes
  • 🔄Installed Prometheus/Grafana for monitoring
  • Add remote API endpoint monitoring, Tailscale status, container status, VM hosts and DNS status, and Nginx service
  • ✅Do something about logs (feed into OpenSearch and query with AI???)
  • ✅Backup Script
  • ✅Check everything reachable via Tailscale SSH
  • ✅Restore Nginx 301 redirect for http
  • ✅Configure new static IP front door and reverse proxies
  • ✅Nginx Lock down 443 for other hosts and return 444 on port 80 (default server)
  • Schedule daily cache clear to free memory
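
A minimal sketch of what the remote endpoint monitoring item could look like in Python, using only the standard library. The function names (http_status, check_endpoints) and any endpoint names passed in are hypothetical, and this is not the Prometheus/Grafana setup itself, just one way to get a simple up/down answer per URL:

```python
from typing import Callable, Dict
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the host cannot be reached."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is still worth reporting.
        return err.code
    except (URLError, OSError):
        # DNS failure, refused connection, timeout, TLS error and similar.
        return 0


def check_endpoints(
    endpoints: Dict[str, str],
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each endpoint name to True when it answers with a 2xx or 3xx status."""
    return {name: 200 <= fetch(url) < 400 for name, url in endpoints.items()}
```

The fetch argument is there so the check can be exercised with a stub instead of real network calls; the resulting dict could then be fed into whatever alerting is chosen.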

Video Summary Parsing issues:

  • 🔄🔜 fix paragraph breaks (result: giant paragraphs)
  • 🔄🔜table of contents missing (each summary should have a list of topics with shortcuts to jump down the page) (nearly - just can't click the links!)
  • 🔄summaries are missing the source list box
  • ✅populate published time from YouTube video upload time for relative time/article timestamp
  • ✅hits and losses should have disclaimer box
  • ✅front line update map should have map legend box
  • ✅formatting issue with timestamps (on summaries with timestamps)
  • ✅extra date needs removing
  • ✅some tagging is incorrect (update required)
  • 🆕image cropping/aspect ratios are a bit messy
  • 🆕remove title from homepage
  • ✅ added in header photos for each category
  • ✅Turn off newsletter box
  • ✅Change all posts link to JP author
  • 🆕Fix timestamps over 1 hour (see UK Election Summary)
  • 🆕Show 'news' tag after sub-tags so that sub-tags are visible (e.g. Geopolitics shows on index)
  • ✅Tag one-of-a-kind videos (e.g. UK election) - as "Extra"? Tagged as "Elections 2024"
  • ✅ Fix tag spacing on summary page
  • ✅Format article date as UK long date
  • ✅ Inject country flag emoji into missing css styles (e.g. p, li)
  • ✅Fix target on timestamp links to open in new tab
  • 🔄🔜Add country flag emoji prefixes/tags/tags page
  • ✅Emoji flag picker (works on Firefox but not Chrome/Edge) - not worth doing
  • 🔄🔜Merch/support page - will build you a nice page to showcase books, merch, buy me a coffee etc. All widgets and buttons added, just need adding to parse/post pipeline.
  • ✅Buy me a coffee widget added to each page
  • 🔄Each summary will also link to your merch/support page and there will be a buy coffee widget on every page.
  • ✅permanent web address
  • ✅emojis ?
  • 🔄🔜Correct EN-US to EN-GB during parse (got word list)
  • 🔄🔜 Spelling of Odesa grrrr
  • ✅Fix typo "Jonathan" on main page
  • ✅Fix UL lists not merging when separated by two line breaks
  • 🔄🔜Handle jax queue exceptions, proxy failure, timeout, candidate errors to restart tasks script (see below for proxy failure, timeout, candidate errors)
  • 🔄🔜To improve reliability with the LLM summaries (regular proxy failures) instead of redirecting the connections to USA via a VPN, I've spun up a Google Cloud Virtual Machine in USA to handle the connections - should be more reliable. Just need to finish off config
  • 🔄Add Book Carousel
  • 🆕Add Merch Carousel
  • 🆕Add Donate to Greg Terry widget for streams with Greg
  • 🔄🔜Add BMAC buttons. Buttons done, need putting in place
  • 🆕Add Donate (PayPal)
  • ✅Submit to Bing/Google
  • ✅Post topic titles excerpt metadata for search (ghostpost)
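
A small sketch relating to the "timestamps over 1 hour" item above. The actual cause of that bug isn't stated here, so this assumes the parser needs to accept both MM:SS and H:MM:SS and render them consistently; the function names and the video ID in any call are hypothetical. The link format uses YouTube's t= parameter in seconds:

```python
def parse_timestamp(text: str) -> int:
    """Convert "MM:SS" or "H:MM:SS" into a total number of seconds."""
    seconds = 0
    for part in text.strip().split(":"):
        # Each step shifts the running total up one unit (seconds -> minutes -> hours).
        seconds = seconds * 60 + int(part)
    return seconds


def format_timestamp(total_seconds: int) -> str:
    """Render seconds as MM:SS, or H:MM:SS once the video passes one hour."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes:02d}:{seconds:02d}"


def youtube_link(video_id: str, total_seconds: int) -> str:
    """Build a link that opens the video at the given offset."""
    return f"https://www.youtube.com/watch?v={video_id}&t={total_seconds}s"
```

Working in whole seconds internally and only formatting at the end avoids the case where minutes run past 59 on long videos.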

🤖Workflow Orchestration (Python Scripts Monitored with Prefect)

  • ✅Re-run article creation with fixes
  • 🆕🙋 ATP Geo videos on A Tippling Philosopher YT channel
  • 🔄Add all Python scripts, scheduling and monitoring to Prefect
  • 🔄🔜some titles aren't correct (some are AI mistakes)
  • 🆕some quotes are a bit iffy (e.g. out of context, will re-run)
  • 🔄🔜Add clickable "re-run" icon against the summary/title/quote/tags/chapter to record an issue with AI content - trigger automatic submission to AI (will need to restrict this to Ben/JP; will create login). Underway.
  • 🔄🔜Some videos are rejected by the safety filter and fail - false positives obvs! Will fix and re-run. Most are rejected due to length currently. Gemini AI Flash seems to work for these so will re-run.
  • 🔄videos over 2 hours need to use a different transcription service (or I need to split the audio) - these are generally interviews and the AI probably needs to summarise these in a different way as they are very different to the normal videos. Write workflow - Got a plan, have started on this
  • 🔄Add Discord AI bot to test channel (start with simple discord alert bot)
  • 🆕 Discord notification for new summaries to orchestration
  • 🔄Add conversational AI bot to interact with semantic search.
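
For the "split the audio" option on videos over 2 hours, a rough sketch of how the cut points might be planned before handing each piece to the transcription service. The 2-hour default comes from the item above; the function name plan_chunks and the 30-second overlap (so a sentence cut at a boundary appears in both pieces) are assumptions, not anything already in the workflow:

```python
from typing import List, Tuple


def plan_chunks(
    duration_s: int,
    max_chunk_s: int = 7200,  # 2 hours, the limit mentioned in the list
    overlap_s: int = 30,      # assumed overlap between neighbouring chunks
) -> List[Tuple[int, int]]:
    """Return (start, end) offsets in seconds that cover the whole recording."""
    if max_chunk_s <= overlap_s:
        raise ValueError("max_chunk_s must be larger than overlap_s")
    if duration_s <= 0:
        return []

    chunks: List[Tuple[int, int]] = []
    start = 0
    while True:
        end = min(start + max_chunk_s, duration_s)
        chunks.append((start, end))
        if end == duration_s:
            break
        # Step back slightly so speech at the boundary is not lost.
        start = end - overlap_s
    return chunks
```

Each (start, end) pair could then be passed to whatever tool does the actual cutting, and the per-chunk transcripts stitched back together using the start offsets.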

🔍Search (OpenSearch Cluster)

  • ✅Change the search to return topics/chapters (so you can see relevant part of each video with a link to the YouTube video at the timestamp) rather than search returning just an entire video, so index needs a slight change
  • ✅Re-index lexical search
  • ✅Turn on search icon pop-up in menu (turned off as it breaks hybrid search)
  • ✅fix event listener
  • ✅Add topics to excerpt and truncate to 300 chars
  • ✅Need to add the metadata to the vector embedding index pipeline and strip HTML / add kNN
  • ✅Update search results page
  • ✅Change the index process to only add new records / updated records rather than the current automatic index rebuild after every new article - this caused massive server lag after each post. Adding new posts works - check updates
  • ✅Fix indexing for topics without timestamps
  • ✅Max results default ignored and actual limit a bit random
  • ✅Analytics - OpenSearch Dashboards - build
  • ✅Fix Docker container for Tailscale SOCKS5 proxy
  • 🆕Sort the search results format/design/colour
  • ✅Fix search performance issues - turned off swap on ghost web server, nginx configured on Oracle instance - seems to perform ok with 50-100 results
  • 🆕Publish stats on indexing
  • 🆕Create a dashboard to see all the tasks at a glance
  • 🔄Add React search box
  • 🔄Add React grid cards
  • 🔄Add React JavaScript timelines
  • 🆕Write some prompts for various outputs
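
A sketch of the idea behind indexing only new/updated records instead of rebuilding the whole index after each post. It assumes each post is a dict with "id" and "updated_at" fields and that the indexer keeps a record of the updated_at value it last indexed for each id; the function name and that bookkeeping are hypothetical, and the actual OpenSearch calls are left out:

```python
from typing import Iterable, List, Mapping


def posts_to_index(
    posts: Iterable[Mapping[str, str]],
    indexed: Mapping[str, str],
) -> List[Mapping[str, str]]:
    """Return only the posts that are new or have changed since last indexing.

    indexed maps post id -> the updated_at value stored at the last index run.
    A post is selected when its id is unknown or its updated_at differs.
    """
    return [post for post in posts if indexed.get(post["id"]) != post["updated_at"]]
```

Only the returned posts would need re-embedding and writing to the index, which is where the lag after each new article should drop off.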