LLM aids for processing of the first Carlson-Putin interview (in Raku)
Introduction
In this notebook we provide aids and computational workflows for the analysis of the first Carlson-Putin interview, held on February 9, 2024. We mostly use Large Language Models (LLMs). We walk through various steps involved in examining and understanding the interview in a systematic and reproducible manner.
The computations are done with a Raku chatbook, [AAp6, AAv1÷AAv3]. The LLM functions used in the workflows are explained and demonstrated in [AA1, AAv3]. The workflows are done with OpenAI's models [AAp1]; the models of Google (PaLM) [AAp2] and MistralAI [AAp3] can also be used for the Part 1 summary and the search engine. The related images were generated with workflows described in [AA2].
Structure
The structure of the notebook is as follows:
1. Getting the interview text
Standard ingestion.
2. Preliminary LLM queries
What are the most important parts or most provocative questions?
3. Part 1: separation and summary
Overview of the historical review.
4. Part 2: thematic parts
TLDR via a table of themes.
5. Interview's spoken parts
Non-LLM extraction of participants' parts.
6. Search engine
Fast results with LLM embeddings.
7. Flavored variations
How would Hillary phrase it? And how would Trump answer it?
Sections 5 and 6 can be skipped -- they are (somewhat) more technical.
Observations
◼
Using the LLM functions for programmatic access to LLMs speeds up the effort by, I would say, a factor of 3-5.
◼
The workflows presented below are fairly universal -- with small changes the notebook can be applied to other interviews.
◼
Using OpenAI's preview model "gpt-4-turbo-preview" spares or simplifies a fair number of workflow elements.
◼
The model "gpt-4-turbo-preview" accepts inputs of up to 128K tokens, hence the whole interview can be processed in one LLM request.
◼
Since I watched the interview, I can judge that the LLM results for the most provocative questions and most important statements are good.
◼
It is interesting to consider to what degree people who have not watched the interview would trust those results.
◼
The search engine can be replaced or enhanced with a Question Answering System (QAS).
◼
The flavored variations might be too subtle.
◼
I expected a more obvious manifestation of the characters involved.
Getting the interview text
The interview text is taken from the Kremlin's dedicated page "Interview to Tucker Carlson", hosted at en.kremlin.ru.
Here we load a package and define a text statistics function:
In[]:=
use HTTP::Tiny;
sub text-stats(Str:D $txt) { <chars words lines> Z=> [$txt.chars, $txt.words.elems, $txt.lines.elems] }
Out[]=
&text-stats
Here we ingest the interview's text:
my $url = 'https://raw.githubusercontent.com/antononcube/SimplifiedMachineLearningWorkflows-book/master/Data/Carlson-Putin-interview-2024-02-09-English.txt';
my $txtEN = HTTP::Tiny.new.get($url)<content>.decode;
$txtEN .= subst(/ \v+ /, "\n", :g);
text-stats($txtEN)
Out[]=
(chars => 97354 words => 16980 lines => 292)
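As a rough sanity check that the whole interview fits in one LLM request, we can estimate the token count with the common heuristic of roughly 4 characters per token (a heuristic, not an exact tokenizer count):
In[]:=
# Rough estimate: ~97K characters at ~4 characters per token ≈ 24K tokens, well under the 128K limit used below
say 'approximate token count: ', ($txtEN.chars / 4).Int;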
Preliminary LLM queries
Here we configure LLM access -- we use OpenAI's model "gpt-4-turbo-preview" since it allows inputs of up to 128K tokens:
In[]:=
my $conf = llm-configuration('ChatGPT', model => 'gpt-4-turbo-preview', max-tokens => 4096, temperature => 0.2, api-key => $openai-auth-key);
$conf.Hash.elems
Out[]=
21
Questions
First we make an LLM request about the number of questions asked:
In[]:=
llm-synthesize(["How many questions were asked in the following interview?", $txtEN], e => $conf)
Out[]=
In the interview between Tucker Carlson and President Vladimir Putin, a total of 75 questions were asked by Tucker Carlson.
Here we ask the LLM to extract the questions into a JSON list:
In[]:=
my $llmQuestions = llm-synthesize([
"Extract all questions from the following interview into a JSON list.",
$txtEN,
llm-prompt('NothingElse')('JSON')
], e => $conf, form => sub-parser('JSON'):drop);
deduce-type($llmQuestions)
"Extract all questions from the following interview into a JSON list.",
$txtEN,
llm-prompt('NothingElse')('JSON')
], e => $conf, form => sub-parser('JSON'):drop);
deduce-type($llmQuestions)
Out[]=
Vector(Atom((Str)), 29)
We can see that the number of LLM-extracted questions is less than half of the LLM-estimated count above. Here are the extracted questions (in a multi-column format):
In[]:=
RakuInputExecute["$llmQuestions>>.values.flat ==> encode-to-wl"]//Multicolumn[ToExpression[#][[1]],3,Dividers->All]&
Out[]=
Tell us why you believe the United States might strike Russia out of the blue. How did you conclude that? | But you haven’t spoken to him since before February of 2022? | What are those power centres in the United States, do you think? And who actually makes the decisions? |
You were initially trained in history, as far as I know? | You do not remember?! | So do you see the supernatural at work? As you look out across what’s happening in the world now, do you see God at work? Do you ever think to yourself: these are forces that are not human? |
I beg your pardon, can you tell us what period… I am losing track of where in history we are. | Do you think NATO was worried about this becoming a global war or nuclear conflict? | When does the AI empire start do you think? |
It doesn’t sound like you are inventing it, but I am not sure why it’s relevant to what’s happened two years ago. | Can you imagine a scenario where you send Russian troops to Poland? | What do you think of that? |
May I ask… You are making the case that Ukraine, certain parts of Ukraine, Eastern Ukraine, in fact, has been Russia for hundreds of years. Why wouldn’t you just take it when you became President 24 years ago? You have nuclear weapons, they don’t. It’s actually your land. Why did you wait so long? | Well, the argument, I know you know this, is that, well, he invaded Ukraine – he has territorial aims across the continent. And you are saying unequivocally, you don’t? | Evan Gershkovich who is the Wall Street Journal reporter, he is 32 and he’s been in prison for almost a year. This is a huge story in the United States and I just want to ask you directly without getting into details of your version of what happened, if as a sign of your decency you’ll be willing to release him to us and we’ll bring him back to the United States? |
Do you believe Hungary has a right to take back its land from Ukraine? And that other nations have a right to go back to their 1654 borders? | Who blew up Nord Stream? | Are you suggesting he was working for the US government or NATO? Or he was just a reporter who was given material he wasn’t supposed to have? Those seem like very different, very different things. |
Have you told Viktor Orban that he can have a part of Ukraine? | Do you have evidence that NATO or the CIA did it? | I think you are saying you want a negotiated settlement to what's happening in Ukraine. |
Were you sincere? Would you have joined NATO? | Why wouldn’t you present it and win a propaganda victory? | Would you be willing to say, “Congratulations, NATO, you won?” And just keep the situation where it is now? |
Why do you think that is? Just to get to motive. I know, you’re clearly bitter about it. I understand. But why do you think the West rebuffed you then? Why the hostility? Why did the end of the Cold War not fix the relationship? What motivates this from your point of view? | Why are they being silent about it? That is very confusing to me. Why wouldn’t the Germans say something about it? | Do you think it is too humiliating at this point for NATO to accept Russian control of what was two years ago Ukrainian territory? |
What did he say? | So, you said a moment ago that the world would be a lot better if it were not broken into competing alliances, if there was cooperation globally. One of the reasons you don’t have that is because the current American administration is dead set against you. Do you think if there was a new administration after Joe Biden that you would be able to re-establish communication with the US government? Or does it not matter who the President is? | |
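For completeness, the extracted questions can also be listed with plain Raku, without the Mathematica bridge used above (a minimal sketch):
In[]:=
# Print each LLM-extracted question on its own line
.say for |$llmQuestions;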
Important parts
Here we make a function for extracting significant parts of the interview:
In[]:=
# LLM function parameterized by the number of items ($^a) and their kind ($^b); note the separator before the interview text
my &fProv = llm-function({"Which are the top $^a most $^b in the following interview?\n\n" ~ $txtEN}, e => $conf)
Most provocative questions
Here we get the most provocative questions:
In[]:=
#% markdown
&fProv(3, 'provocative questions')
Based on the content and context of the interview between Tucker Carlson and President Vladimir Putin, identifying the top three most provocative questions involves subjective judgment. However, considering the potential for controversy, international implications, and the depth of response they elicited, the following three questions can be considered among the most provocative:
1. NATO Expansion and Perceived Threats to Russia:
◼
Question: "On February 24, 2022, you addressed your country in your nationwide address when the conflict in Ukraine started and you said that you were acting because you had come to the conclusion that the United States through NATO might initiate a quote, “surprise attack on our country.” And to American ears that sounds paranoid. Tell us why you believe the United States might strike Russia out of the blue. How did you conclude that?"
◼
Context: This question directly challenges Putin's justification for the military actions in Ukraine, suggesting paranoia, and seeks an explanation for Russia's perceived threat from NATO and the U.S., which is central to understanding the conflict's origins from Russia's perspective.
2. Possibility of Negotiated Settlement in Ukraine:
◼
Question: "Do you think Zelensky has the freedom to negotiate the settlement to this conflict?"
◼
Context: This question probes the autonomy and authority of Ukrainian President Volodymyr Zelensky in the context of peace negotiations, implicitly questioning the influence of external powers (notably the U.S.) on Ukraine's decision-making and the potential for resolving the conflict through diplomacy.
3. Use of Nuclear Weapons and Global Conflict:
◼
Question: "Do you think NATO was worried about this becoming a global war or nuclear conflict?"
◼
Context: Given the nuclear capabilities of Russia and the escalating tensions with NATO, this question touches on the fears of a broader, potentially nuclear, conflict. Putin's response could provide insights into Russia's stance on the use of nuclear weapons and its perception of NATO's concerns about escalation.
These questions are provocative due to their direct challenge to Putin's actions and rationale, their exploration of sensitive geopolitical issues, and their potential to elicit responses that could have significant international repercussions.
Most important statements
Here we get the most important statements:
#% markdown
&fProv(3, 'important statements')
Based on the extensive interview, the top 3 most important statements that stand out for their significance in understanding the broader context of the conversation and the positions of the involved parties are:
1. Vladimir Putin's assertion on NATO expansion and its impact on Russia: Putin's repeated emphasis on NATO's expansion as a direct threat to Russia's security and the broken promises regarding NATO not expanding eastward. This is a critical point as it underlines Russia's longstanding grievance and justification for its actions in Ukraine, reflecting the deep-seated geopolitical tensions between Russia and the West.
2. Putin's readiness for a negotiated settlement in Ukraine: Putin's statements indicating a willingness to negotiate a settlement for the conflict in Ukraine, blaming the West and Ukraine for the lack of dialogue and suggesting that the ball is in their court to make amends and come back to the negotiating table. This is significant as it portrays Russia's stance on seeking a diplomatic resolution, albeit under conditions that are likely to favor Russian interests.
3. Discussion on the potential global implications of the conflict: The dialogue around the fear of the conflict in Ukraine escalating into a larger, possibly global war, and the mention of nuclear threats. This highlights the high stakes involved not just for the immediate parties but for global security, underscoring the urgency and gravity of finding a peaceful resolution to the conflict.
These statements are pivotal as they encapsulate the core issues at the heart of the Russia-Ukraine conflict, the geopolitical dynamics with NATO and the West, and the potential paths towards resolution or further escalation.
Part 1: separation and summary
In the first part of the interview Putin gave a historical overview of the formation and evolution of the "Ukrainian lands." We can extract the first part of the interview "manually" like this:
In[]:=
my ($part1, $part2) = $txtEN.split('Tucker Carlson: Do you believe Hungary has a right to take back its land from Ukraine?');
say "Part 1 stats: {&text-stats($part1)}";
say "Part 2 stats: {&text-stats($part2)}";
say "Part 1 stats: {&text-stats($part1)}";
say "Part 2 stats: {&text-stats($part2)}";
Alternatively, we can ask ChatGPT to make the extraction for us:
In[]:=
my $splittingQuestion = llm-synthesize([
"Which question by Tucker Carlson splits the following interview into two parts:",
"(1) historical overview Ukraine's formation, and (2) shorter answers.",
$txtEN,
llm-prompt('NothingElse')('the splitting question by Tucker Carlson')
], e => $conf)
"Which question by Tucker Carlson splits the following interview into two parts:",
"(1) historical overview Ukraine's formation, and (2) shorter answers.",
$txtEN,
llm-prompt('NothingElse')('the splitting question by Tucker Carlson')
], e => $conf)
Here is the first part of the interview according to the LLM result:
In[]:=
my $llmPart1 = $txtEN.split($splittingQuestion.substr(10,200)).first;
&text-stats($llmPart1)
Remark: We can see that the LLM-derived split leaves out nearly 1/3 of the "manually" selected text. Below we continue with the latter.
Summary of the first part
Here is a summary of the first part of the interview:
llm-synthesize(['Summarize the following part one of the Carlson-Putin interview:', $part1], e => $conf);
In the first part of the interview between Tucker Carlson and President Vladimir Putin, Carlson questions Putin about his statement made on February 24, 2022, regarding the conflict in Ukraine. Putin clarifies that he never stated the U.S. would launch a surprise attack on Russia, prompting a discussion on the historical context of Russia-Ukraine relations. Putin traces the origins of the Russian state to 862 and discusses the historical development of the region, including the significance of Kiev and Novgorod, the adoption of Orthodoxy in 988, and the eventual fragmentation and reunification of Russian lands.
Putin explains the historical ties between Russia and Ukraine, mentioning the Polonization efforts in the region and the eventual appeal of the people in parts of what is now Ukraine to Moscow for protection in the 17th century. He provides a detailed historical account of the territorial changes and political maneuvers over the centuries, including the role of the Bolsheviks in establishing Soviet Ukraine and the allocation of territories post-World War II.
Throughout the interview, Putin emphasizes the historical connections between Russia and Ukraine, arguing that Ukraine as it is known today is an artificial state shaped by Soviet policies. He presents documents to support his claims and discusses the complex history of the region to explain the current conflict. Carlson questions the relevance of this historical context to the present situation and why Putin did not assert these claims earlier in his presidency. Putin responds by highlighting the historical basis for Russia's position and the creation of Soviet Ukraine with territories that had different historical affiliations.
Part 2: thematic parts
Here we make an LLM request for finding and distilling the themes of the second part of the interview:
In[]:=
my $llmParts = llm-synthesize([
'Split the following second part of the Carlson-Putin interview into thematic parts:',
$part2,
"Return the parts as a JSON array.",
llm-prompt('NothingElse')('JSON')
], e => $conf, form => sub-parser('JSON'):drop);
deduce-type($llmParts)
Here we tabulate the found themes:
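The tabulation cell is not reproduced here; a minimal sketch with to-pretty-table from Data::Reshapers (loaded in the Setup section) might look as follows. The field names "theme" and "content" are hypothetical -- inspect $llmParts and adjust the keys to the actual JSON structure returned:
In[]:=
# Hypothetical field names "theme" and "content" -- adjust to the structure of $llmParts
to-pretty-table($llmParts.map({ %( theme => ($_<theme> // ''), content => ($_<content> // '').Str.substr(0, 80) ) }), field-names => <theme content>)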
Interview's spoken parts
In this section we separate the spoken parts of each participant in the interview. We do that using regular expressions, not LLMs.
Here we split the interview text with the names of the participants:
In[]:=
my @parts = $txtEN.split(/ 'Tucker Carlson:' | ['President of Russia' \h+]? 'Vladimir Putin:' /, :v, :skip-empty)>>.Str.rotor(2).map({ Pair.new(|$_) });
say "Total parts {@parts.elems}";
say "First 4:";
.say for @parts[^4];
say "Total parts {@parts.elems}";
say "First 4:";
.say for @parts[^4];
Here we further process the separate participant names and corresponding parts into a list of pairs:
In[]:=
@parts .= map({ ($_.key.contains('Carlson') ?? 'T.Carlson' !! 'V.Putin') => $_.value.trim });
@parts.elems
Here we get the spoken parts of Tucker Carlson that end with a question mark (and consider them to be the "questions"):
In[]:=
my @tcQuestions = @parts.grep({ $_.key.contains('Carlson') && $_.value ~~ / '?' $/ }).map(*.value);
@tcQuestions.elems
A table with all questions:
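The table cell itself is not reproduced here; a minimal sketch using to-pretty-table from Data::Reshapers (loaded in the Setup section) could be:
In[]:=
# Tabulate the questions with their indexes (question texts truncated for readability)
to-pretty-table(@tcQuestions.kv.map(-> $i, $q { %( no => $i, question => $q.substr(0, 100) ) }), field-names => <no question>)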
Search engine
In this section we make a (mini) search engine over the interview parts obtained above.
Here are the steps:
1. Make sure the interview parts are associated with unique identifiers that also identify the speakers.
2. Find the embedding vectors for each part.
3. Create a recommendation function that:
◼ Filters the embeddings according to a specified type
◼ Finds the embedding vector of a given query
◼ Finds the dot products of the query vector with the part vectors
◼ Picks the top results
Here we make a hash-map of the interview parts obtained above:
In[]:=
my $k = 0;
my %parts = @parts.map({ "{$k++} {$_.key}" => $_.value });
%parts.elems
Here we find the LLM embedding vectors of the interview parts:
In[]:=
my %embs = %parts.keys.Array Z=> openai-embeddings(%parts.values, format => 'values', api-key => $openai-auth-key).Array;
%embs.elems;
Here is a function to find the most relevant parts of the interview for a given query (using dot product):
In[]:=
sub top-parts(Str $query, UInt $n = 3, :$type is copy = 'answers' ) {
my @vec = |openai-embeddings($query, format=>'values', api-key => $openai-auth-key).head;
if $type.isa(Whatever) { $type = 'part'; }
my %embsLocal = do given $type {
when $_ (elem) <part statement> {
%embs
}
when $_ (elem) <answer answers Putin> {
%embs.grep({ $_.key.contains('Putin') })
}
when $_ (elem) <question questions Carlson Tucker> {
%embs.grep({ $_.key.contains('Carlson') })
}
default {
die "Do not know how to process the $type arugment."
}
}
my @sres = %embsLocal.map({ $_.key => sum($_.value >>*<< @vec) });
@sres .= sort({ - $_.value });
return @sres[^$n].map({ %(Score => $_.value, Text => %parts{$_.key}) }).Array;
}
Here we find the top 3 results for a query:
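The query cell is not reproduced above; here is a usage sketch, where the query string and the type argument are illustrative choices:
In[]:=
# Top 3 of Putin's answers most relevant to an example query
.say for top-parts('Who blew up Nord Stream?', 3, type => 'answers');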
Flavored variations
In this section we show how the spoken parts can be rephrased in the style of certain political celebrities.
Here are examples of using an LLM to rephrase Tucker Carlson's questions in the style of Hillary Clinton:
In[]:=
for ^2 {
my $q = @tcQuestions.pick;
say '=' x 100;
say "Tucker Carlson: $q";
say '-' x 100;
my $q2 = llm-synthesize(["Rephrase this question in the style of Hillary Clinton:", $q], e=>$conf);
say "Hilary Clinton: $q2";
}
Here are examples of using an LLM to rephrase Vladimir Putin's answers in the style of Donald Trump:
In[]:=
for ^2 {
my $q = @parts.grep({ $_.key.contains('Putin') }).pick.value;
say '=' x 100;
say "Vladimir Putin: $q";
say '-' x 100;
my $q2 = llm-synthesize(["Rephrase this answer in the style of Donald Trump:", $q], e=>$conf);
say "Donald Trump: $q2";
}
Setup
Packages
In[]:=
use WWW::OpenAI;
use WWW::PaLM;
use LLM::Functions;
use LLM::Prompts;
use Text::SubParsers;
use Data::TypeSystem;
use Data::Reshapers;
use Text::Plot;
use JavaScript::D3;
use JSON::Fast;
use HTTP::Tiny;
use Mathematica::Serializer;
Configurations
Here we extract the OpenAI and PaLM API keys from the OS environment:
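The corresponding cell is not reproduced here; a minimal sketch follows, in which the environment variable names are assumptions -- adjust them to your setup:
In[]:=
# The environment variable names are assumptions; change them to match your configuration
my $openai-auth-key = %*ENV<OPENAI_API_KEY>;
my $palm-auth-key = %*ENV<PALM_API_KEY>;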
Here we make OpenAI and PaLM default configurations:
In[]:=
my $confOpenAI = llm-configuration('OpenAI', api-key => $openai-auth-key);
my $confPaLM = llm-configuration('PaLM', api-key => $palm-auth-key);
($confOpenAI.name,$confPaLM.name).raku
References
Articles
[AA2] Anton Antonov, "Day 21 – Using DALL-E models in Raku", (2023), Raku Advent Calendar blog for 2023.
[OAIb1] OpenAI team, "New models and developer products announced at DevDay", (2023), OpenAI/blog.
Packages
Videos
[AAv2] Anton Antonov, "Jupyter Chatbook multi cell LLM chats teaser (Raku)", (2023), YouTube/@AAA4Prediction.
[AAv3] Anton Antonov, "Integrating Large Language Models with Raku", (2023), YouTube/@therakuconference6823.