llm summarization

using large-language-models for summarization

2 min, 335 words

notes

  • I did a bunch of googling and found that most summarization involved extracting actual text segments that were representative of the overall document. That's true even of newer code using large-language-models trained to be good at extraction, and of some experimental work.

  • I didn't want that -- I instead wanted abstractive summaries, what we'd normally think of as summaries.

  • I knew ChatGPT could do a fantastic job on this, and then remembered work I did some six months ago on getting an open-source chat-oriented large-language-model running, for experimentation.

  • I got that code running again, following a video tutorial, and built on it to experiment with using chat for summarization.

    • Interesting: when I tried that code six months ago, it worked pretty smoothly. But now, that model is old (the link 404ed). I had a hard time finding it, and the libraries that worked with it then no longer do (I had to downgrade to older versions). Shows how fast things are changing!
  • Though I did get the summary-of-summaries approach working, for the demo I switched back to a simpler approach of just summarizing the first 1,000 words.

    • For the demo, this produced good results: since everything was a single-page scan, the first 1,000 words usually covered the full text anyway.
    • I keep hearing that newer models are better, faster, and handle more tokens -- so for the Hall-Hoag project I may not need the summary-of-summaries approach, except to experiment with organization-as-a-whole summarization.
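The summary-of-summaries approach described above can be sketched as follows. This is a minimal illustration, not the actual code: chunk_words and summarize are hypothetical stand-ins, and summarize here is a placeholder where a real chat-model call would go.

```python
def chunk_words(text, max_words=1000):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def summarize(text):
    # Placeholder for a chat-model call (e.g. a "Summarize the
    # following document:" prompt). Truncates so the sketch runs.
    return " ".join(text.split()[:50])

def summary_of_summaries(text, max_words=1000):
    """Summarize directly if the text fits in one chunk; otherwise
    summarize each chunk, then summarize the combined summaries."""
    chunks = chunk_words(text, max_words)
    if len(chunks) == 1:
        return summarize(chunks[0])
    partials = [summarize(chunk) for chunk in chunks]
    return summarize(" ".join(partials))
```

The simpler first-1,000-words approach used for the demo amounts to just summarize(chunk_words(text)[0]).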
  • Note the prompt-experimentation for the description-text -- and for subtitle-text.
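For illustration, prompt templates for those two text styles might look like this -- these are hypothetical prompts sketching the shape of the experiment, not the prompts actually used:

```python
# Hypothetical prompt templates for the two summary styles -- the
# exact wording is what the prompt experimentation was about.
DESCRIPTION_PROMPT = (
    "Summarize the following document in two or three sentences, "
    "written as a catalog description:\n\n{document}"
)
SUBTITLE_PROMPT = (
    "Write a short subtitle (under ten words) capturing the main "
    "topic of the following document:\n\n{document}"
)

def build_prompt(template, document):
    """Fill a prompt template with the document text."""
    return template.format(document=document)
```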

  • Note also the meta experience of using a large-language-model to explain the knobs (parameters) for working with a large-language-model. :)
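For reference, the knobs in question are typically sampling parameters like these. Names follow Hugging Face transformers' generate() conventions; the values are illustrative, not the ones used here:

```python
# Common text-generation "knobs" (names follow Hugging Face
# transformers' generate() conventions; values are illustrative).
generation_params = {
    "max_new_tokens": 256,      # cap on the length of the generated summary
    "temperature": 0.7,         # < 1.0 sharpens the distribution (less random)
    "top_p": 0.9,               # nucleus sampling: keep the smallest token set
                                #   whose probabilities sum to 0.9
    "top_k": 50,                # sample only from the 50 most likely tokens
    "repetition_penalty": 1.1,  # discourage repeating the same phrases
    "do_sample": True,          # sample rather than greedy-decode
}
```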