
Look, I know what you're thinking. "Why not just use Elasticsearch?" or "What about Algolia?" Those are valid options, but they come with complexity. You need to learn their APIs, manage their infrastructure, and deal with their quirks.
Sometimes you just want something simpler: a search engine that uses your existing database, respects your current architecture, and gives you full control over how it works. That's what I built.
The concept is simple: tokenize everything, store it, then match tokens when searching. Here's how it works: documents are broken into weighted tokens, those tokens are stored in the database, and at search time the query is tokenized the same way and matched against them. The magic is in the tokenization and the weighting. Let me show you what I mean.
We need two simple tables: index_tokens and index_entries.
This table stores all unique tokens with their tokenizer weights. Each token name can have multiple records with different weights—one per tokenizer.
// index_tokens table structure
id | name | weight
---|---------|-------
1 | parser | 20 // From WordTokenizer
2 | parser | 5 // From PrefixTokenizer
3 | parser | 1 // From NGramsTokenizer
4 | parser | 10 // From SingularTokenizer
Why store separate tokens per weight? Different tokenizers produce the same token with different weights. For example, "parser" from WordTokenizer has weight 20, but "parser" from PrefixTokenizer has weight 5. We need separate records to properly score matches.
The unique constraint is on (name, weight), so the same token name can exist multiple times with different weights.
This table links tokens to documents with field-specific weights.
// index_entries table structure
id | token_id | document_type | field_id | document_id | weight
---|----------|---------------|----------|-------------|-------
1 | 1 | 1 | 1 | 42 | 2000
2 | 2 | 1 | 1 | 42 | 500
The weight here is the final calculated weight: field_weight × tokenizer_weight × ceil(sqrt(token_length)). This encodes everything we need for scoring. We will talk about scoring later in the post.
We add indexes on:
- (document_type, document_id) - for fast document lookups
- token_id - for fast token lookups
- (document_type, field_id) - for field-specific queries
- weight - for filtering by weight

Why this structure? Simple, efficient, and leverages what databases do best.
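The post doesn't include the schema definition itself. A minimal sketch as a Doctrine migration (column types, lengths, and index names are my assumptions) might look like this:

use Doctrine\DBAL\Schema\Schema;
use Doctrine\Migrations\AbstractMigration;

final class VersionCreateSearchIndexTables extends AbstractMigration
{
    public function up(Schema $schema): void
    {
        // index_tokens: unique on (name, weight), so the same token can exist once per tokenizer weight
        $this->addSql('CREATE TABLE index_tokens (
            id INT AUTO_INCREMENT PRIMARY KEY,
            name VARCHAR(191) NOT NULL,
            weight INT NOT NULL,
            UNIQUE KEY uniq_token_name_weight (name, weight)
        )');

        // index_entries: links tokens to documents with the final calculated weight
        $this->addSql('CREATE TABLE index_entries (
            id BIGINT AUTO_INCREMENT PRIMARY KEY,
            token_id INT NOT NULL,
            document_type SMALLINT NOT NULL,
            field_id SMALLINT NOT NULL,
            document_id INT NOT NULL,
            weight INT NOT NULL,
            KEY idx_entries_document (document_type, document_id),
            KEY idx_entries_token (token_id),
            KEY idx_entries_field (document_type, field_id),
            KEY idx_entries_weight (weight),
            CONSTRAINT fk_entries_token FOREIGN KEY (token_id) REFERENCES index_tokens (id)
        )');
    }

    public function down(Schema $schema): void
    {
        $this->addSql('DROP TABLE index_entries');
        $this->addSql('DROP TABLE index_tokens');
    }
}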
What is tokenization? It's breaking text into searchable pieces. The word "parser" becomes tokens like ["parser"], ["par", "pars", "parse", "parser"], or ["par", "ars", "rse", "ser"] depending on which tokenizer we use.
Why multiple tokenizers? Different strategies for different matching needs. One tokenizer for exact matches, another for partial matches, another for typos.
All tokenizers implement a simple interface:
interface TokenizerInterface
{
public function tokenize(string $text): array; // Returns array of Token objects
public function getWeight(): int; // Returns tokenizer weight
}
Simple contract, easy to extend.
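The Token value object itself isn't shown in this post. Based on how it's used later ($token->value and $token->weight), a minimal version would be something like:

final class Token
{
    public function __construct(
        public readonly string $value,
        public readonly int $weight
    ) {}
}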
This one is straightforward—it splits text into individual words. "parser" becomes just ["parser"]. Simple, but powerful for exact matches.
First, we normalize the text. Lowercase everything, remove special characters, normalize whitespace:
class WordTokenizer implements TokenizerInterface
{
    public function __construct(
        private int $weight = 20
    ) {}

    public function tokenize(string $text): array
    {
        // Normalize: lowercase, remove special chars
        $text = mb_strtolower(trim($text));
        $text = preg_replace('/[^a-z0-9]/', ' ', $text);
        $text = preg_replace('/\s+/', ' ', $text);
Next, we split into words and filter out short ones:
// Split into words, filter short ones
$words = explode(' ', $text);
$words = array_filter($words, fn($w) => mb_strlen($w) >= 2);
Why filter short words? Single-character words are usually too common to be useful. "a", "I", "x" don't help with search.
Finally, we return unique words as Token objects:
// Return as Token objects with weight
return array_map(
fn($word) => new Token($word, $this->weight),
array_unique($words)
);
}
}
Weight: 20 (high priority for exact matches)
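As a quick illustration (assuming the default weight of 20):

$tokenizer = new WordTokenizer();
$tokens = $tokenizer->tokenize('The Parser handles text!');
// Tokens: "the", "parser", "handles", "text" (each with weight 20)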
This generates word prefixes. "parser" becomes ["par", "pars", "parse", "parser"] (with min length 4). This helps with partial matches and autocomplete-like behavior.
First, we extract words (same normalization as WordTokenizer):
class PrefixTokenizer implements TokenizerInterface
{
public function __construct(
private int $minPrefixLength = 4,
private int $weight = 5
) {}
public function tokenize(string $text): array
{
// Normalize same as WordTokenizer
$words = $this->extractWords($text);
Then, for each word, we generate prefixes from the minimum length to the full word:
$tokens = [];
foreach ($words as $word) {
$wordLength = mb_strlen($word);
// Generate prefixes from min length to full word
for ($i = $this->minPrefixLength; $i <= $wordLength; $i++) {
$prefix = mb_substr($word, 0, $i);
$tokens[$prefix] = true; // Use associative array for uniqueness
}
}
Why use an associative array? It ensures uniqueness. If "parser" appears twice in the text, we only want one "parser" token.
Finally, we convert the keys to Token objects:
return array_map(
fn($prefix) => new Token($prefix, $this->weight),
array_keys($tokens)
);
}
}
Weight: 5 (medium priority)
Why min length? Avoid too many tiny tokens. Prefixes shorter than 4 characters are usually too common to be useful.
This creates character sequences of a fixed length (I use 3). "parser" becomes ["par", "ars", "rse", "ser"]. This catches typos and partial word matches.
First, we extract words:
class NGramsTokenizer implements TokenizerInterface
{
public function __construct(
private int $ngramLength = 3,
private int $weight = 1
) {}
public function tokenize(string $text): array
{
$words = $this->extractWords($text);
Then, for each word, we slide a window of fixed length across it:
$tokens = [];
foreach ($words as $word) {
$wordLength = mb_strlen($word);
// Sliding window of fixed length
for ($i = 0; $i <= $wordLength - $this->ngramLength; $i++) {
$ngram = mb_substr($word, $i, $this->ngramLength);
$tokens[$ngram] = true;
}
}
The sliding window: for "parser" with length 3, we get "par", "ars", "rse", and "ser".
Why does this work? Even if someone types "parsr" (a typo), we still get the "par" and "ars" tokens, which match the correctly spelled "parser".
Finally, we convert to Token objects:
return array_map(
fn($ngram) => new Token($ngram, $this->weight),
array_keys($tokens)
);
}
}
Weight: 1 (low priority, but catches edge cases)
Why 3? Balance between coverage and noise. Too short and you get too many matches, too long and you miss typos.
All tokenizers apply the same normalization: lowercase the text, strip special characters, collapse whitespace, and drop very short words. This ensures consistent matching regardless of input format.
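The extractWords() helper used by the prefix and n-gram tokenizers isn't shown in the post; a minimal sketch that mirrors the WordTokenizer normalization:

private function extractWords(string $text): array
{
    // Normalize: lowercase, strip special characters, collapse whitespace
    $text = mb_strtolower(trim($text));
    $text = preg_replace('/[^a-z0-9]/', ' ', $text);
    $text = preg_replace('/\s+/', ' ', $text);

    // Split into words and drop single-character noise
    $words = explode(' ', $text);

    return array_values(array_filter($words, fn($w) => mb_strlen($w) >= 2));
}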
We have three levels of weights working together: the field weight, the tokenizer weight, and a token length factor:

field_weight × tokenizer_weight × ceil(sqrt(token_length))

When indexing, we calculate the final weight like this:
$finalWeight = $fieldWeight * $tokenizerWeight * ceil(sqrt($tokenLength));
For example, a title match (field weight 10) from the WordTokenizer (weight 20) on the six-character token "parser" gives 10 × 20 × ceil(sqrt(6)) = 10 × 20 × 3 = 600.
Why use ceil(sqrt())? Longer tokens are more specific, but we don't want weights to blow up with very long tokens. "parser" is more specific than "par", but a 100-character token shouldn't have 100x the weight. The square root function gives us diminishing returns—longer tokens still score higher, but not linearly. We use ceil() to round up to the nearest integer, keeping weights as whole numbers.
You can adjust the weights for your use case: everything lives in your code, so you can see exactly how weights are calculated and tune them as needed.
The indexing service takes a document and stores all its tokens in the database.
Documents that can be indexed implement IndexableDocumentInterface:
interface IndexableDocumentInterface
{
public function getDocumentId(): int;
public function getDocumentType(): DocumentType;
public function getIndexableFields(): IndexableFields;
}
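DocumentType and FieldId are backed enums referenced throughout the post but not shown; the exact cases and values below are assumptions:

enum DocumentType: int
{
    case POST = 1;
    case COMMENT = 2;
}

enum FieldId: int
{
    case TITLE = 1;
    case CONTENT = 2;
    case KEYWORDS = 3;
}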
To make a document searchable, you implement these three methods:
class Post implements IndexableDocumentInterface
{
public function getDocumentId(): int
{
return $this->id ?? 0;
}
public function getDocumentType(): DocumentType
{
return DocumentType::POST;
}
public function getIndexableFields(): IndexableFields
{
$fields = IndexableFields::create()
->addField(FieldId::TITLE, $this->title ?? '', 10)
->addField(FieldId::CONTENT, $this->content ?? '', 1);
// Add keywords if present
if (!empty($this->keywords)) {
$fields->addField(FieldId::KEYWORDS, $this->keywords, 20);
}
return $fields;
}
}
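The IndexableFields builder isn't shown in the post either; based on how it's used (a fluent addField() plus getFields()/getWeights()), a minimal sketch could be:

final class IndexableFields
{
    /** @var array<int, string> content keyed by field id */
    private array $fields = [];

    /** @var array<int, int> weight keyed by field id */
    private array $weights = [];

    public static function create(): self
    {
        return new self();
    }

    public function addField(FieldId $fieldId, string $content, int $weight): self
    {
        $this->fields[$fieldId->value] = $content;
        $this->weights[$fieldId->value] = $weight;

        return $this;
    }

    public function getFields(): array
    {
        return $this->fields;
    }

    public function getWeights(): array
    {
        return $this->weights;
    }
}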
Three methods to implement:
getDocumentType(): returns the document type enumgetDocumentId(): returns the document IDgetIndexableFields(): builds fields with weights using fluent APIYou can index documents:
app:index-document, app:reindex-documentsHere's the indexing process, step by step.
First, we get the document information:
class SearchIndexingService
{
public function indexDocument(IndexableDocumentInterface $document): void
{
// 1. Get document info
$documentType = $document->getDocumentType();
$documentId = $document->getDocumentId();
$indexableFields = $document->getIndexableFields();
$fields = $indexableFields->getFields();
$weights = $indexableFields->getWeights();
The document provides its fields and weights via the IndexableFields builder.
Next, we remove the existing index for this document. This handles updates—if the document changed, we need to reindex it:
// 2. Remove existing index for this document
$this->removeDocumentIndex($documentType, $documentId);
// 3. Prepare batch insert data
$insertData = [];
Why remove first? If we just add new tokens, we'll have duplicates. Better to start fresh.
Now, we process each field. For each field, we run all tokenizers:
// 4. Process each field
foreach ($fields as $fieldIdValue => $content) {
if (empty($content)) {
continue;
}
$fieldId = FieldId::from($fieldIdValue);
$fieldWeight = $weights[$fieldIdValue] ?? 0;
// 5. Run all tokenizers on this field
foreach ($this->tokenizers as $tokenizer) {
$tokens = $tokenizer->tokenize($content);
For each tokenizer, we get tokens. Then, for each token, we find or create it in the database and calculate the final weight:
foreach ($tokens as $token) {
$tokenValue = $token->value;
$tokenWeight = $token->weight;
// 6. Find or create token in index_tokens
$tokenId = $this->findOrCreateToken($tokenValue, $tokenWeight);
// 7. Calculate final weight
$tokenLength = mb_strlen($tokenValue);
$finalWeight = (int) ($fieldWeight * $tokenWeight * ceil(sqrt($tokenLength)));
// 8. Add to batch insert
$insertData[] = [
'token_id' => $tokenId,
'document_type' => $documentType->value,
'field_id' => $fieldId->value,
'document_id' => $documentId,
'weight' => $finalWeight,
];
}
}
}
Why batch insert? Performance. Instead of inserting one row at a time, we collect all rows and insert them in one query.
Finally, we batch insert everything:
// 9. Batch insert for performance
if (!empty($insertData)) {
$this->batchInsertSearchDocuments($insertData);
}
}
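The batchInsertSearchDocuments() method isn't shown in the post; a minimal sketch using a single multi-row INSERT through Doctrine DBAL might look like this:

private function batchInsertSearchDocuments(array $insertData): void
{
    // Collect placeholders and flatten the parameters into one multi-row INSERT
    $placeholders = [];
    $params = [];

    foreach ($insertData as $row) {
        $placeholders[] = '(?, ?, ?, ?, ?)';
        $params[] = $row['token_id'];
        $params[] = $row['document_type'];
        $params[] = $row['field_id'];
        $params[] = $row['document_id'];
        $params[] = $row['weight'];
    }

    $sql = 'INSERT INTO index_entries (token_id, document_type, field_id, document_id, weight) VALUES '
        . implode(', ', $placeholders);

    $this->connection->executeStatement($sql, $params);
}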
The findOrCreateToken method is straightforward:
private function findOrCreateToken(string $name, int $weight): int
{
// Try to find existing token with same name and weight
$sql = "SELECT id FROM index_tokens WHERE name = ? AND weight = ?";
$result = $this->connection->executeQuery($sql, [$name, $weight])->fetchAssociative();
if ($result) {
return (int) $result['id'];
}
// Create new token
$insertSql = "INSERT INTO index_tokens (name, weight) VALUES (?, ?)";
$this->connection->executeStatement($insertSql, [$name, $weight]);
return (int) $this->connection->lastInsertId();
}
}
Why find or create? Tokens are shared across documents. If "parser" already exists with weight 20, we reuse it. No need to create duplicates.
The key points: remove the old index entries before reindexing, run every tokenizer over every field, reuse tokens via find-or-create, pre-calculate the final weight (field × tokenizer × ceil(sqrt(length))), and insert everything in one batch.
The search service takes a query string and finds relevant documents. It tokenizes the query the same way we tokenized documents during indexing, then matches those tokens against the indexed tokens in the database. The results are scored by relevance and returned as document IDs with scores.
Here's the search process, step by step.
First, we tokenize the query using all tokenizers:
class SearchService
{
public function search(DocumentType $documentType, string $query, ?int $limit = null): array
{
// 1. Tokenize query using all tokenizers
$queryTokens = $this->tokenizeQuery($query);
if (empty($queryTokens)) {
return [];
}
If the query produces no tokens (e.g., only special characters), we return empty results.
Different tokenizers produce different token values. If we index with one set and search with another, we'll miss matches.
Example: if documents were indexed with only the WordTokenizer (producing "parser") but the query were tokenized with only the NGramsTokenizer (producing "par", "ars", "rse", "ser"), none of the query tokens would match the indexed token.
The solution: Use the same tokenizers for both indexing and searching. Same tokenization strategy = same token values = complete matches.
This is why the SearchService and SearchIndexingService both receive the same set of tokenizers.
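For completeness, tokenizeQuery() is just a loop over the same injected tokenizers (a sketch, not the post's exact code):

private function tokenizeQuery(string $query): array
{
    $tokens = [];

    // Run every configured tokenizer over the query and merge the results
    foreach ($this->tokenizers as $tokenizer) {
        foreach ($tokenizer->tokenize($query) as $token) {
            $tokens[] = $token;
        }
    }

    return $tokens;
}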
Next, we extract unique token values. Multiple tokenizers might produce the same token value, so we deduplicate:
// 2. Extract unique token values
$tokenValues = array_unique(array_map(
fn($token) => $token instanceof Token ? $token->value : $token,
$queryTokens
));
Why extract values? We search by token name, not by weight. We need the unique token names to search for.
Then, we sort tokens by length (longest first). This prioritizes specific matches:
// 3. Sort tokens (longest first - prioritize specific matches)
usort($tokenValues, fn($a, $b) => mb_strlen($b) <=> mb_strlen($a));
Why sort? Longer tokens are more specific. "parser" is more specific than "par", so we want to search for "parser" first.
We also limit the token count to prevent DoS attacks with huge queries:
// 4. Limit token count (prevent DoS with huge queries)
if (count($tokenValues) > 300) {
$tokenValues = array_slice($tokenValues, 0, 300);
}
Why limit? A malicious user could send a query that produces thousands of tokens, causing performance issues. We keep the longest 300 tokens (already sorted).
Now, we execute the optimized SQL query. The executeSearch() method builds the SQL query and executes it:
// 5. Execute optimized SQL query
$results = $this->executeSearch($documentType, $tokenValues, $limit);
Inside executeSearch(), we build the SQL query with parameter placeholders, execute it, filter low-scoring results, and convert to SearchResult objects:
private function executeSearch(DocumentType $documentType, array $tokenValues, ?int $limit, int $minTokenWeight = 10): array
{
    $tokenCount = count($tokenValues);

    // Build parameter placeholders for token values
    $tokenPlaceholders = implode(',', array_fill(0, $tokenCount, '?'));
// Build the SQL query (shown in full in "The SQL Query" section below)
$sql = "SELECT sd.document_id, ... FROM index_entries sd ...";
// Build parameters array
$params = [
$documentType->value, // document_type
...$tokenValues, // token values for IN clause
$documentType->value, // for subquery
...$tokenValues, // token values for subquery
$minTokenWeight, // minimum token weight
// ... more parameters
];
// Execute query with parameter binding
$results = $this->connection->executeQuery($sql, $params)->fetchAllAssociative();
// Filter out results with low normalized scores (below threshold)
$results = array_filter($results, fn($r) => (float) $r['score'] >= 0.05);
// Convert to SearchResult objects
return array_map(
fn($result) => new SearchResult(
documentId: (int) $result['document_id'],
score: (float) $result['score']
),
$results
);
}
The SQL query does the heavy lifting: finds matching documents, calculates scores, and sorts by relevance. We use raw SQL for performance and full control—we can optimize the query exactly how we need it.
The query uses JOINs to connect tokens and documents, subqueries for normalization, aggregation for scoring, and indexes on token name, document type, and weight. We use parameter binding for security (prevents SQL injection).
We'll see the full query in the next section.
The main search() method then returns the results:
// 6. Return results
return $results;
}
}
The scoring algorithm balances multiple factors. Let's break it down step by step.
The base score is the sum of all matched token weights:
SELECT
sd.document_id,
SUM(sd.weight) as base_score
FROM index_entries sd
INNER JOIN index_tokens st ON sd.token_id = st.id
WHERE
sd.document_type = ?
AND st.name IN (?, ?, ?) -- Query tokens
GROUP BY sd.document_id
sd.weight comes from index_entries, and it already encodes field_weight × tokenizer_weight × ceil(sqrt(token_length)).
Why not multiply by st.weight? The tokenizer weight is already included in sd.weight during indexing. The st.weight from index_tokens is used only in the full SQL query's WHERE clause for filtering (it ensures at least one token with weight >= minTokenWeight).
This gives us the raw score. But we need more than that.
We add a token diversity boost. Documents matching more unique tokens score higher:
(1.0 + LOG(1.0 + COUNT(DISTINCT sd.token_id))) * base_score
Why? A document matching 5 different tokens is more relevant than one matching the same token 5 times. The LOG function makes this boost logarithmic—matching 10 tokens doesn't give 10x the boost.
We also add an average weight quality boost. Documents with higher quality matches score higher:
(1.0 + LOG(1.0 + AVG(sd.weight))) * base_score
Why? A document with high-weight matches (e.g., title matches) is more relevant than one with low-weight matches (e.g., content matches). Again, LOG makes this logarithmic.
We apply a document length penalty. Prevents long documents from dominating:
base_score / (1.0 + LOG(1.0 + doc_token_count.token_count))
Why? A 1000-word document doesn't automatically beat a 100-word document just because it has more tokens. The LOG function makes this penalty logarithmic—a 10x longer document doesn't get 10x the penalty.
Finally, we normalize by dividing by the maximum score:
score / GREATEST(1.0, max_score) as normalized_score
This gives us a 0-1 range, making scores comparable across different queries.
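To get a feel for the numbers (an illustrative calculation, not one from the post, assuming LOG() is the natural logarithm as in MySQL): suppose a document matches 3 distinct query tokens with entry weights 600, 500, and 100, and has 50 indexed tokens in total. The base score is 1200, the diversity boost is 1 + ln(1 + 3) ≈ 2.39, the quality boost is 1 + ln(1 + 400) ≈ 6.99, and the length penalty divisor is 1 + ln(1 + 50) ≈ 4.93, giving roughly 1200 × 2.39 × 6.99 ÷ 4.93 ≈ 4,060 before normalization.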
The full formula looks like this:
SELECT
sd.document_id,
(
SUM(sd.weight) * -- Base score
(1.0 + LOG(1.0 + COUNT(DISTINCT sd.token_id))) * -- Token diversity boost
(1.0 + LOG(1.0 + AVG(sd.weight))) / -- Average weight quality boost
(1.0 + LOG(1.0 + doc_token_count.token_count)) -- Document length penalty
) / GREATEST(1.0, max_score) as score -- Normalization
FROM index_entries sd
INNER JOIN index_tokens st ON sd.token_id = st.id
INNER JOIN (
SELECT document_id, COUNT(*) as token_count
FROM index_entries
WHERE document_type = ?
GROUP BY document_id
) doc_token_count ON sd.document_id = doc_token_count.document_id
WHERE
sd.document_type = ?
AND st.name IN (?, ?, ?) -- Query tokens
AND sd.document_id IN (
SELECT DISTINCT document_id
FROM index_entries sd2
INNER JOIN index_tokens st2 ON sd2.token_id = st2.id
WHERE sd2.document_type = ?
AND st2.name IN (?, ?, ?)
AND st2.weight >= ? -- Ensure at least one token with meaningful weight
)
GROUP BY sd.document_id
ORDER BY score DESC
LIMIT ?
Why the subquery with st2.weight >= ?? This ensures we only include documents that have at least one matching token with a meaningful tokenizer weight. Without this filter, a document matching only low-priority tokens (like n-grams with weight 1) would be included even if it doesn't match any high-priority tokens (like words with weight 20). This subquery filters out documents that only match noise. We want documents that match at least one meaningful token.
Why this formula? It balances multiple factors for relevance. Exact matches score high, but so do documents matching many tokens. Long documents don't dominate, but high-quality matches do.
If there are no results with a minimum token weight of 10, we retry with a minimum weight of 1 (a fallback for edge cases).
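One way to implement that fallback, assuming the executeSearch() signature sketched earlier (with minTokenWeight defaulting to 10):

// Retry with the lowest tokenizer weight if nothing meaningful matched
$results = $this->executeSearch($documentType, $tokenValues, $limit);

if (empty($results)) {
    $results = $this->executeSearch($documentType, $tokenValues, $limit, minTokenWeight: 1);
}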
The search service returns SearchResult objects with document IDs and scores:
class SearchResult
{
public function __construct(
public readonly int $documentId,
public readonly float $score
) {}
}
But we need actual documents, not just IDs. We convert them using repositories:
// Perform search
$searchResults = $this->searchService->search(
DocumentType::POST,
$query,
$limit
);
// Get document IDs from search results (preserving order)
$documentIds = array_map(fn($result) => $result->documentId, $searchResults);
// Get documents by IDs (preserving order from search results)
$documents = $this->documentRepository->findByIds($documentIds);
Why preserve order? The search results are sorted by relevance score. We want to keep that order when displaying results.
The repository method handles the conversion:
public function findByIds(array $ids): array
{
if (empty($ids)) {
return [];
}
return $this->createQueryBuilder('d')
->where('d.id IN (:ids)')
->setParameter('ids', $ids)
->orderBy('FIELD(d.id, :ids)') // Preserve order from IDs array (FIELD() is MySQL-specific and requires a custom DQL function registered in Doctrine)
->getQuery()
->getResult();
}
The FIELD() function preserves the order from the IDs array, so documents appear in the same order as search results.
What you get is a search engine that lives in your existing database, needs no extra infrastructure, and gives you full control over tokenization, indexing, and scoring.
Want to add a new tokenizer? Implement TokenizerInterface:
class StemmingTokenizer implements TokenizerInterface
{
public function tokenize(string $text): array
{
// Your stemming logic here
// Return array of Token objects
}
public function getWeight(): int
{
return 15; // Your weight
}
}
Register it in your services configuration, and it's automatically used for both indexing and searching.
Want to add a new document type? Implement IndexableDocumentInterface:
class Comment implements IndexableDocumentInterface
{
    // getDocumentId() and getDocumentType() are omitted here; implement them as in the Post example above
    public function getIndexableFields(): IndexableFields
{
return IndexableFields::create()
->addField(FieldId::CONTENT, $this->content ?? '', 5);
}
}
Want to adjust weights? Change the configuration. Want to modify scoring? Edit the SQL query. Everything is under your control.
So there you have it. A simple search engine that actually works. It's not fancy, and it doesn't need a lot of infrastructure, but for most use cases, it's perfect.
The key insight? Sometimes the best solution is the one you understand. No magic, no black boxes, just straightforward code that does what it says.
You own it, you control it, you can debug it. And that's worth a lot.
Hi Folks,
This week I want to talk about something that might surprise you: the performance cost of optional chaining in JavaScript. A question came up recently about whether using a noop function pattern is faster than optional chaining, and the answer might make you rethink some of your coding patterns.
After a pull request review I did, Simone Sanfratello created a comprehensive benchmark to verify some of my thinking on this topic, and the results were eye-opening.
Let's start with a simple scenario. You have two approaches to handle optional function calls:
// Approach 1: Noop function
function noop() {}
function testNoop() {
noop();
}
// Approach 2: Optional chaining
const a = {};
function testOptionalChaining() {
a.b?.fn?.();
}
Both accomplish the same goal: they execute safely without throwing errors. But how do they compare performance-wise?
Simone and I ran comprehensive benchmarks with 5 million iterations to get precise measurements. The results were striking:
| Test Case | Ops/Second | Relative to Noop |
|---|---|---|
| Noop Function Call | 939,139,797 | Baseline |
| Optional Chaining (empty object) | 134,240,361 | 7.00x slower |
| Optional Chaining (with method) | 149,748,151 | 6.27x slower |
| Deep Optional Chaining (empty) | 106,370,022 | 8.83x slower |
| Deep Optional Chaining (with method) | 169,510,591 | 5.54x slower |
Yes, you read that right. Noop functions are 5.5x to 8.8x faster than optional chaining operations.
The performance difference comes down to what the JavaScript engine needs to do:
Noop function: Simple function call overhead. The V8 engine optimizes this extremely well - it's just a jump to a known address and back. In fact, V8 will inline trivial functions like noop, making them essentially zero-overhead. The function call completely disappears in the optimized code.
Optional chaining: Property lookup, null/undefined check, potentially multiple checks for chained operations, and then the function call. Each ?. adds overhead that V8 can't optimize away because it has to perform the null/undefined checks at runtime.
The deeper your optional chaining, the worse it gets. Triple chaining like a?.b?.c?.fn?.() is about 1.17x slower than single-level optional chaining.
This is exactly why Fastify uses the abstract-logging module. When no logger is provided, instead of checking logger?.info?.() throughout the codebase, Fastify provides a noop logger object with all the logging methods as noop functions.
// Instead of this everywhere in the code:
server.logger?.info?.('Request received');
server.logger?.error?.('Something went wrong');
// Fastify does this:
const logger = options.logger || require('abstract-logging');
// Now just call it directly:
server.logger.info('Request received');
server.logger.error('Something went wrong');
This is an important technique: provide noops upfront rather than check for existence later. V8 inlines these noop functions, so when logging is disabled, you pay essentially zero cost. The function call is optimized away completely. But if you use optional chaining, you're stuck with the runtime checks every single time, and V8 can't optimize those away.
One of the reasons we see so much unnecessary optional chaining in modern codebases is TypeScript. TypeScript's type system encourages defensive coding by marking properties as potentially undefined, even when your runtime guarantees they exist. This leads developers to add ?. everywhere "just to be safe" and satisfy the type checker.
Consider this common pattern:
interface Config {
hooks?: {
onRequest?: () => void;
}
}
function processRequest(config: Config) {
config.hooks?.onRequest?.(); // Is this really needed?
}
If you know your config object always has hooks defined at runtime, you're paying the optional chaining tax unnecessarily. TypeScript's strictNullChecks pushes you toward this defensive style, but it comes at a performance cost. The type system can't know your runtime invariants, so it forces you to check things that might never actually be undefined in practice.
The solution? Use type assertions or better type modeling when you have runtime guarantees. Here's how:
// Instead of this:
config.hooks?.onRequest?.();
// Do this if you know hooks always exists:
config.hooks!.onRequest?.();
// Or even better, fix the types to match reality:
interface Config {
hooks: {
onRequest?: () => void;
onResponse?: () => void;
}
}
// Now you can write:
config.hooks.onRequest?.();
// Or if you control both and know onRequest exists, use a noop:
const onRequest = config.hooks.onRequest || noop;
onRequest();
Don't let TypeScript's pessimistic type system trick you into defensive code you don't need.
Before you rush to refactor all your optional chaining, let me add some important context:
Even the "slowest" optional chaining still executes at 106+ million operations per second. For most applications, this performance difference is completely negligible. You're not going to notice the difference unless you're doing this in an extremely hot code path.
Memory usage is also identical across both approaches - no concerns there.
Don't premature optimize. Write your code with optional chaining where it makes sense for safety and readability. For most Node.js applications, including web servers and APIs, optional chaining is perfectly fine. The safety and readability benefits far outweigh the performance cost in 99% of cases.
However, noop functions make sense when you're in a performance-critical hot path or every microsecond counts. If you control the code and can guarantee the function exists, skipping the optional chaining overhead is a clear win. Think high-frequency operations, tight loops, or code that runs thousands of times per request. Even at a few thousand calls per request, that 5-8x performance difference starts to add up.
If profiling shows that a specific code path is a bottleneck, then consider switching to noop functions or other optimizations. Use optional chaining for dealing with external data or APIs where you don't control the structure, and use it in normal business logic where code readability and safety are priorities.
Remember: readable, maintainable code is worth more than micro-optimizations in most cases. But when those microseconds matter, now you know the cost.
Thanks to Simone Sanfratello for creating the benchmarks that confirmed these performance characteristics!
I'm worried that they put co-pilot in Excel because Excel is the beast that drives our entire economy and do you know who has tamed that beast?
Brenda.
Who is Brenda?
She is a mid-level employee in every finance department, in every business across this stupid nation and the Excel goddess herself descended from the heavens, kissed Brenda on her forehead and the sweat from Brenda's brow is what allows us to do capitalism. [...]
She's gonna birth that formula for a financial report and then she's gonna send that financial report to a higher up and he's gonna need to make a change to the report and normally he would have sent it back to Brenda but he's like oh I have AI and AI is probably like smarter than Brenda and then the AI is gonna fuck it up real bad and he won't be able to recognize it because he doesn't understand Excel because AI hallucinates.
You know who's not hallucinating?
Brenda.
— @belligerentbarbies, on TikTok