AI Duplicate Content Detector for Symfony Using PHP and OpenAI Embeddings
If you've been running a Symfony-based blog or CMS for a while, chances are you already have duplicate content. You just don't know it yet. Editors rewrite old articles, documentation pages grow organically, and over time you end up with five pages that all basically say the same thing, just worded differently.
The usual approach to catching this, string matching or exact text comparison, falls apart the moment someone changes a few words. Two articles can be 90% the same in meaning and a simple diff won't flag either of them.
That's where OpenAI embeddings come in. Instead of comparing words, we compare meaning. In this tutorial, I'll show you how to build a duplicate content detector in Symfony that uses vector embeddings and cosine similarity to catch semantically similar articles, even when the wording is completely different.
What We're Building
By the end of this guide, you will have:
- AI-generated embeddings for every article
- A semantic similarity checker based on cosine similarity
- A console command to find duplicates
- A similarity threshold (e.g., 85%+, a cosine score of 0.85) for flagging content
- A foundation that can be integrated into any Symfony CMS
This works well for:
- Blogs
- Knowledge bases
- Documentation portals
- E-commerce content pages
Requirements
- Symfony 6 or 7
- PHP 8.1+
- Doctrine ORM
- MySQL / PostgreSQL
- An OpenAI API key
Step 1: Add an Embedding Column to Your Entity
Assume an Article entity.
src/Entity/Article.php
#[ORM\Column(type: 'json', nullable: true)]
private ?array $embedding = null;

public function getEmbedding(): ?array
{
    return $this->embedding;
}

public function setEmbedding(?array $embedding): self
{
    $this->embedding = $embedding;

    return $this;
}
Create and run the migration:
php bin/console make:migration
php bin/console doctrine:migrations:migrate
Step 2: Generate Embeddings for Articles
Create a Symfony command:
php bin/console make:command app:generate-article-embeddings
GenerateArticleEmbeddingsCommand.php
namespace App\Command;

use App\Entity\Article;
use Doctrine\ORM\EntityManagerInterface;
use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\DependencyInjection\Attribute\Autowire;

// #[AsCommand] replaces the static $defaultName property, which was removed in Symfony 7.
#[AsCommand(name: 'app:generate-article-embeddings')]
class GenerateArticleEmbeddingsCommand extends Command
{
    public function __construct(
        private EntityManagerInterface $em,
        // Injects OPENAI_API_KEY from the environment (requires Symfony 6.1+).
        #[Autowire(value: '%env(OPENAI_API_KEY)%')]
        private string $apiKey
    ) {
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        $articles = $this->em->getRepository(Article::class)->findAll();

        foreach ($articles as $article) {
            // Skip articles that already have an embedding.
            if ($article->getEmbedding()) {
                continue;
            }

            $embedding = $this->getEmbedding(strip_tags($article->getContent()));
            if ($embedding === []) {
                // Do not store an empty vector when the API call fails.
                $output->writeln("Could not generate embedding for article ID {$article->getId()}");
                continue;
            }

            $article->setEmbedding($embedding);
            $this->em->persist($article);
            $output->writeln("Embedding generated for article ID {$article->getId()}");
        }

        $this->em->flush();

        return Command::SUCCESS;
    }

    private function getEmbedding(string $text): array
    {
        $payload = [
            'model' => 'text-embedding-3-small',
            // Cap the input length to keep requests small.
            'input' => mb_substr($text, 0, 4000),
        ];

        $ch = curl_init('https://api.openai.com/v1/embeddings');
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_HTTPHEADER => [
                'Content-Type: application/json',
                "Authorization: Bearer {$this->apiKey}",
            ],
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($payload),
        ]);
        $response = curl_exec($ch);
        curl_close($ch);

        // curl_exec() returns false when the request itself fails.
        if ($response === false) {
            return [];
        }

        return json_decode($response, true)['data'][0]['embedding'] ?? [];
    }
}
Store the API key in .env.local, where it can be read as the OPENAI_API_KEY environment variable:
OPENAI_API_KEY=your_key_here
Step 3: Cosine Similarity Helper
Create a reusable helper.
src/Service/SimilarityService.php
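namespace App\Service;

class SimilarityService
{
    public function cosine(array $a, array $b): float
    {
        // Vectors of different lengths cannot be compared meaningfully.
        if (count($a) !== count($b)) {
            throw new \InvalidArgumentException('Vectors must have the same length.');
        }

        $dot = 0.0;
        $magA = 0.0;
        $magB = 0.0;
        foreach ($a as $i => $val) {
            $dot += $val * $b[$i];
            $magA += $val ** 2;
            $magB += $b[$i] ** 2;
        }

        // An empty or all-zero vector has no direction; avoid dividing by zero.
        if ($magA === 0.0 || $magB === 0.0) {
            return 0.0;
        }

        return $dot / (sqrt($magA) * sqrt($magB));
    }
}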
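To get a feel for the scores, here is a minimal standalone sketch of the same formula applied to tiny hand-made vectors. The function name cosineSimilarity and the sample vectors are illustrative assumptions; real embeddings from text-embedding-3-small have 1536 dimensions by default.

```php
<?php
declare(strict_types=1);

/**
 * Cosine similarity of two equal-length numeric vectors.
 * Standalone illustration; the name and guards are assumptions for this sketch.
 */
function cosineSimilarity(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length.');
    }

    $dot = 0.0;
    $magA = 0.0;
    $magB = 0.0;
    foreach ($a as $i => $val) {
        $dot += $val * $b[$i];
        $magA += $val ** 2;
        $magB += $b[$i] ** 2;
    }

    // A zero vector has no direction, so report "no similarity" instead of dividing by zero.
    if ($magA === 0.0 || $magB === 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($magA) * sqrt($magB));
}

// Same direction, different length: prints 1.00
printf("%.2f\n", cosineSimilarity([1, 2, 3], [2, 4, 6]));
// Perpendicular vectors: prints 0.00
printf("%.2f\n", cosineSimilarity([1, 0], [0, 1]));
// Opposite direction: prints -1.00
printf("%.2f\n", cosineSimilarity([1, 0], [-1, 0]));
```

The score depends only on the angle between the vectors, not on their length, which is why a long article and a short summary of it can still score close to 1.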
Step 4: Detect Duplicate Articles
Create another command:
php bin/console make:command app:detect-duplicates
DetectDuplicateContentCommand.php
namespace App\Command;

use App\Entity\Article;
use App\Service\SimilarityService;
use Doctrine\ORM\EntityManagerInterface;
use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;

// #[AsCommand] replaces the static $defaultName property, which was removed in Symfony 7.
#[AsCommand(name: 'app:detect-duplicates')]
class DetectDuplicateContentCommand extends Command
{
    public function __construct(
        private EntityManagerInterface $em,
        private SimilarityService $similarity
    ) {
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        $articles = $this->em->getRepository(Article::class)->findAll();
        $threshold = 0.85;

        foreach ($articles as $i => $a) {
            foreach ($articles as $j => $b) {
                // Compare each pair only once and never an article with itself.
                if ($j <= $i) {
                    continue;
                }
                // Articles without an embedding cannot be compared.
                if (!$a->getEmbedding() || !$b->getEmbedding()) {
                    continue;
                }

                $score = $this->similarity->cosine(
                    $a->getEmbedding(),
                    $b->getEmbedding()
                );

                if ($score >= $threshold) {
                    $output->writeln(
                        sprintf(
                            "⚠ Duplicate detected (%.2f): Article %d and %d",
                            $score,
                            $a->getId(),
                            $b->getId()
                        )
                    );
                }
            }
        }

        return Command::SUCCESS;
    }
}
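The nested loop compares every pair of articles exactly once, so the work grows as n * (n - 1) / 2. The sketch below shows that growth and one possible optimization, assuming plain arrays instead of Doctrine entities: normalize every vector once so each comparison is a single dot product. The function names pairCount, normalize, and findDuplicatePairs, as well as the sample vectors, are hypothetical.

```php
<?php
declare(strict_types=1);

/** Number of unique pairs among $n articles: n * (n - 1) / 2. */
function pairCount(int $n): int
{
    return intdiv($n * ($n - 1), 2);
}

/** Scale a vector to unit length so cosine similarity reduces to a dot product. */
function normalize(array $v): array
{
    $mag = sqrt(array_sum(array_map(fn ($x) => $x ** 2, $v)));

    return $mag > 0.0 ? array_map(fn ($x) => $x / $mag, $v) : $v;
}

/**
 * @param array<int, float[]> $embeddingsById article ID => embedding (all the same length)
 * @return array<int, array{0: int, 1: int, 2: float}> [idA, idB, score] triples
 */
function findDuplicatePairs(array $embeddingsById, float $threshold = 0.85): array
{
    $ids = array_keys($embeddingsById);
    $unit = array_map('normalize', array_values($embeddingsById));
    $pairs = [];

    for ($i = 0, $n = count($ids); $i < $n; $i++) {
        for ($j = $i + 1; $j < $n; $j++) {
            // Dot product of two unit vectors equals their cosine similarity.
            $score = 0.0;
            foreach ($unit[$i] as $k => $x) {
                $score += $x * $unit[$j][$k];
            }
            if ($score >= $threshold) {
                $pairs[] = [$ids[$i], $ids[$j], $score];
            }
        }
    }

    return $pairs;
}

echo pairCount(1000), "\n"; // 499500 comparisons for 1,000 articles

$pairs = findDuplicatePairs([
    12 => [0.9, 0.1, 0.0],
    37 => [0.8, 0.2, 0.0],
    44 => [0.0, 0.1, 0.9],
]);
foreach ($pairs as [$a, $b, $score]) {
    printf("Article %d and %d: %.2f\n", $a, $b, $score); // Article 12 and 37: 0.99
}
```

At a few thousand articles this is still quick in PHP; beyond that, the quadratic growth is the main reason to look at a vector database.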
Step 5: Run via Cron (Optional)
To scan regularly, add a cron job:
0 2 * * * php /path/to/project/bin/console app:detect-duplicates
You can store results in a table or send email notifications.
Example Output
Duplicate detected (0.91): Article 12 and 37
Duplicate detected (0.88): Article 18 and 44
Useful Improvements
This system can be expanded with:
- An admin UI for reviewing duplicates
- Automatic canonical page suggestions
- Weighting of the title and excerpt
- Similarity detection at the section level
- Batch processing with Messenger
- A vector database for large-scale datasets
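As one possible take on title and excerpt weighting, you could embed each field separately and blend the vectors with a weighted average before comparing. This is only a sketch under that assumption: the function name blendEmbeddings, the field names, and the weights are hypothetical, and the right weights depend on your content.

```php
<?php
declare(strict_types=1);

/**
 * Weighted average of equal-length vectors.
 *
 * @param array<string, float[]> $vectors field name => embedding
 * @param array<string, float>   $weights field name => weight (hypothetical values)
 * @return float[]
 */
function blendEmbeddings(array $vectors, array $weights): array
{
    $total = 0.0;
    $blended = [];
    foreach ($vectors as $field => $vector) {
        $weight = $weights[$field] ?? 0.0;
        $total += $weight;
        foreach ($vector as $i => $value) {
            $blended[$i] = ($blended[$i] ?? 0.0) + $weight * $value;
        }
    }

    if ($total <= 0.0) {
        throw new InvalidArgumentException('At least one field needs a positive weight.');
    }

    return array_map(fn (float $sum): float => $sum / $total, $blended);
}

$blended = blendEmbeddings(
    ['title' => [1.0, 0.0], 'body' => [0.0, 1.0]],
    ['title' => 0.25, 'body' => 0.75]
);
print_r($blended); // elements 0.25 and 0.75
```

The blended vector can then be stored in the embedding column and compared with cosine similarity exactly like a single-field embedding.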
Cost & Performance Advice
- Create embeddings for each article only once.
- Limit the length of the content before embedding it.
- Skip draft content.
- Cache similarity results.
- For big datasets, use queues.
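The first two points can be sketched as a small preparation step: clean and truncate the content, then fingerprint it so an article is only re-embedded when its text actually changes. The helper names prepareForEmbedding and contentHash are hypothetical, and storing the hash would need an extra column on the entity that this tutorial does not define.

```php
<?php
declare(strict_types=1);

/** Strip markup, collapse whitespace, and cap the length before sending text to the API. */
function prepareForEmbedding(string $html, int $maxChars = 4000): string
{
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    $text = trim(preg_replace('/\s+/u', ' ', $text) ?? $text);

    return mb_substr($text, 0, $maxChars);
}

/** Fingerprint of the prepared text; if it is unchanged, the stored embedding can be reused. */
function contentHash(string $preparedText): string
{
    return hash('sha256', $preparedText);
}

$old = prepareForEmbedding("<p>Hello   <b>world</b></p>\n");
$new = prepareForEmbedding('<div>Hello world</div>');

echo $old, "\n";                                   // Hello world
var_dump(contentHash($old) === contentHash($new)); // bool(true): no new API call needed
```

Comparing the hash before calling the API means markup-only edits cost nothing, while real text changes still trigger a fresh embedding.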
AI-Powered Semantic Search in Symfony Using PHP and OpenAI Embeddings
LIKE/MATCH queries have a hard ceiling. I've seen Symfony projects where the client kept complaining that search "doesn't work" and the real issue was never the code, it was that users don't search the way you index. They type "how to reset password" and your database has an article titled "Account Recovery Guide." Zero overlap, zero results.
Switching to OpenAI embeddings fixes this at the architecture level. Instead of matching strings, you convert both the query and your content into vectors and measure how close they are in meaning.
A 1536-dimension float array per article sounds heavy but in practice it's stored as JSON in a text column and the whole thing runs fine on a standard MySQL setup for sites with a few thousand articles.
This tutorial wires it up in Symfony using a console command to generate embeddings and a controller endpoint to run the search. No external vector database needed to get started.
Prerequisites
Before we start, make sure you have:
- Symfony 6 or 7
- PHP 8.1+
- Composer
- A MySQL or SQLite database
- An OpenAI API key
Step 1: Create a New Symfony Command
We’ll use a console command to generate embeddings for your existing content (articles, pages, etc.).
Inside your Symfony project, run:
php bin/console make:command app:generate-embeddings
This will create a new file in src/Command/GenerateEmbeddingsCommand.php.
Replace its contents with the following:
src/Command/GenerateEmbeddingsCommand.php
namespace App\Command;

use App\Entity\Article;
use Doctrine\ORM\EntityManagerInterface;
use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\DependencyInjection\Attribute\Autowire;

#[AsCommand(
    name: 'app:generate-embeddings',
    description: 'Generate AI embeddings for all articles'
)]
class GenerateEmbeddingsCommand extends Command
{
    private string $endpoint = 'https://api.openai.com/v1/embeddings';

    public function __construct(
        private EntityManagerInterface $em,
        // Injects OPENAI_API_KEY from the environment instead of hardcoding it (Symfony 6.1+).
        #[Autowire(value: '%env(OPENAI_API_KEY)%')]
        private string $apiKey
    ) {
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        $articles = $this->em->getRepository(Article::class)->findAll();

        foreach ($articles as $article) {
            $embedding = $this->getEmbedding($article->getContent());

            if ($embedding) {
                // The entity stores the vector as a JSON string in a text column.
                $article->setEmbedding(json_encode($embedding));
                $this->em->persist($article);
                $output->writeln("✅ Generated embedding for article ID {$article->getId()}");
            }
        }

        $this->em->flush();

        return Command::SUCCESS;
    }

    private function getEmbedding(string $text): ?array
    {
        $payload = [
            'model' => 'text-embedding-3-small',
            'input' => $text,
        ];

        $ch = curl_init($this->endpoint);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_HTTPHEADER => [
                'Content-Type: application/json',
                "Authorization: Bearer {$this->apiKey}",
            ],
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($payload),
        ]);
        $response = curl_exec($ch);
        curl_close($ch);

        // curl_exec() returns false when the request itself fails.
        if ($response === false) {
            return null;
        }

        $data = json_decode($response, true);

        return $data['data'][0]['embedding'] ?? null;
    }
}
This command takes every article from the database, sends its content to OpenAI’s Embedding API, and saves the resulting vector in a database field.
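The response handling is the part most likely to fail in practice (network errors, an invalid key, rate limits), so it can help to isolate it. Here is a minimal sketch of a stricter parser, assuming the usual response shape with the vector under data[0].embedding and failures reported under an error key. The function name extractEmbedding and the trimmed-down sample payloads are illustrative, not real API output.

```php
<?php
declare(strict_types=1);

/**
 * Pull the embedding out of a raw API response body.
 * Returns null when the request failed, the JSON is invalid, or the API reported an error.
 */
function extractEmbedding(string|false $response): ?array
{
    if ($response === false || $response === '') {
        return null; // curl_exec() failed or returned nothing
    }

    $data = json_decode($response, true);
    if (!is_array($data) || isset($data['error'])) {
        return null; // malformed JSON or an error payload
    }

    $embedding = $data['data'][0]['embedding'] ?? null;

    return is_array($embedding) && $embedding !== [] ? $embedding : null;
}

// Hand-written sample payloads for illustration only.
$ok = '{"data":[{"index":0,"embedding":[0.1,-0.2,0.3]}]}';
$error = '{"error":{"message":"Incorrect API key provided"}}';

var_dump(extractEmbedding($ok) === [0.1, -0.2, 0.3]); // bool(true)
var_dump(extractEmbedding($error));                    // NULL
var_dump(extractEmbedding(false));                     // NULL
```

Returning null for every failure mode keeps the calling loop simple: an article without a usable vector is just skipped and picked up on the next run.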
Step 2: Update the Entity
Assume your entity is App\Entity\Article.
We’ll add a new column called embedding to store the vector data.
src/Entity/Article.php
#[ORM\Column(type: 'text', nullable: true)]
private ?string $embedding = null;

public function getEmbedding(): ?string
{
    return $this->embedding;
}

public function setEmbedding(?string $embedding): self
{
    $this->embedding = $embedding;

    return $this;
}
Then update your database:
php bin/console make:migration
php bin/console doctrine:migrations:migrate
Step 3: Create a Search Endpoint
Next, add a basic controller that takes a search query, turns it into an embedding, and returns the articles that are most semantically similar to it.
src/Controller/SearchController.php
namespace App\Controller;

use App\Entity\Article;
use Doctrine\ORM\EntityManagerInterface;
use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\DependencyInjection\Attribute\Autowire;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\Routing\Annotation\Route;

class SearchController extends AbstractController
{
    private string $endpoint = 'https://api.openai.com/v1/embeddings';

    public function __construct(
        // Injects OPENAI_API_KEY from the environment instead of hardcoding it (Symfony 6.1+).
        #[Autowire(value: '%env(OPENAI_API_KEY)%')]
        private string $apiKey
    ) {
    }

    #[Route('/search', name: 'ai_search')]
    public function search(Request $request, EntityManagerInterface $em): Response
    {
        $query = trim((string) $request->query->get('q'));
        if ($query === '') {
            return $this->json(['error' => 'Please provide a search query'], Response::HTTP_BAD_REQUEST);
        }

        $queryVector = $this->getEmbedding($query);
        if ($queryVector === []) {
            // The API call failed, so there is nothing to compare against.
            return $this->json(['error' => 'Could not generate an embedding for the query'], Response::HTTP_BAD_GATEWAY);
        }

        $articles = $em->getRepository(Article::class)->findAll();
        $results = [];

        foreach ($articles as $article) {
            if ($article->getEmbedding()) {
                $score = $this->cosineSimilarity(
                    $queryVector,
                    json_decode($article->getEmbedding(), true) ?? []
                );
                $results[] = [
                    'id' => $article->getId(),
                    'title' => $article->getTitle(),
                    'similarity' => $score,
                ];
            }
        }

        // Highest similarity first.
        usort($results, fn($a, $b) => $b['similarity'] <=> $a['similarity']);

        return $this->json(array_slice($results, 0, 5)); // top 5 results
    }

    private function getEmbedding(string $text): array
    {
        $payload = [
            'model' => 'text-embedding-3-small',
            'input' => $text,
        ];

        $ch = curl_init($this->endpoint);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_HTTPHEADER => [
                'Content-Type: application/json',
                "Authorization: Bearer {$this->apiKey}",
            ],
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($payload),
        ]);
        $response = curl_exec($ch);
        curl_close($ch);

        // curl_exec() returns false when the request itself fails.
        if ($response === false) {
            return [];
        }

        $data = json_decode($response, true);

        return $data['data'][0]['embedding'] ?? [];
    }

    private function cosineSimilarity(array $a, array $b): float
    {
        // Vectors of different lengths (or a corrupted stored value) are not comparable.
        if (count($a) !== count($b)) {
            return 0.0;
        }

        $dot = 0.0;
        $magA = 0.0;
        $magB = 0.0;
        for ($i = 0; $i < count($a); $i++) {
            $dot += $a[$i] * $b[$i];
            $magA += $a[$i] ** 2;
            $magB += $b[$i] ** 2;
        }

        // Avoid dividing by zero for empty or all-zero vectors.
        if ($magA === 0.0 || $magB === 0.0) {
            return 0.0;
        }

        return $dot / (sqrt($magA) * sqrt($magB));
    }
}
Now a request such as /search?q=php+framework+tutorial returns the articles that are semantically closest to the query, even if they don't contain the exact keywords.
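The ranking step can be tried without a database or an API call. The sketch below uses made-up two-dimensional vectors in place of real embeddings; the function names cosine and rankArticles and the sample titles are hypothetical.

```php
<?php
declare(strict_types=1);

/** Cosine similarity; returns 0.0 for mismatched or zero-length vectors. */
function cosine(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        return 0.0;
    }
    $dot = 0.0;
    $magA = 0.0;
    $magB = 0.0;
    foreach ($a as $i => $val) {
        $dot += $val * $b[$i];
        $magA += $val ** 2;
        $magB += $b[$i] ** 2;
    }

    return ($magA === 0.0 || $magB === 0.0) ? 0.0 : $dot / (sqrt($magA) * sqrt($magB));
}

/**
 * @param float[] $queryVector
 * @param array<int, array{title: string, embedding: float[]}> $articles keyed by article ID
 * @return array<int, array{id: int, title: string, similarity: float}> best matches first
 */
function rankArticles(array $queryVector, array $articles, int $limit = 5): array
{
    $results = [];
    foreach ($articles as $id => $article) {
        $results[] = [
            'id' => $id,
            'title' => $article['title'],
            'similarity' => cosine($queryVector, $article['embedding']),
        ];
    }
    usort($results, fn (array $x, array $y): int => $y['similarity'] <=> $x['similarity']);

    return array_slice($results, 0, $limit);
}

// Made-up vectors: the first axis loosely stands for "account access" topics.
$articles = [
    1 => ['title' => 'Account Recovery Guide', 'embedding' => [0.9, 0.1]],
    2 => ['title' => 'Release Notes', 'embedding' => [0.1, 0.9]],
    3 => ['title' => 'Changing Your Email', 'embedding' => [0.7, 0.4]],
];

foreach (rankArticles([1.0, 0.0], $articles, 2) as $hit) {
    printf("%s (%.2f)\n", $hit['title'], $hit['similarity']);
}
// Account Recovery Guide (0.99)
// Changing Your Email (0.87)
```

Keeping the ranking in a plain function like this also makes it easy to unit test separately from the controller.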
Step 4: Try It Out
Run the following command:
php bin/console app:generate-embeddings
This generates embeddings for all articles.
Now visit the following URL:
http://your-symfony-app.local/search?q=learn+symfony+mvc
The JSON response lists the five most relevant articles, ranked by meaning rather than by keyword.
Real-World Applications
- A more intelligent search within a CMS or knowledge base
- AI-supported matching of FAQs
- Semantic suggestions ("you might also like...")
- Clustering of topics or duplicates in admin panels
Tips for Security and Performance
- Reuse and cache embeddings (avoid making repeated API calls for the same content).
- Keep your API key in .env.local (OPENAI_API_KEY=your_key) rather than hardcoding it in a class.
- For better performance, think about using a vector database such as Pinecone, Weaviate, or Qdrant once you grow beyond a few thousand records.
Steps to create a Contact Form in Symfony With SwiftMailer
In this article, we will look at how to create a contact form in Symfony with SwiftMailer. Symfony provides an architecture, components, and tools that help developers build complex web applications faster. Choosing Symfony lets you release your applications earlier, host and scale them without problems, and maintain them over time with no surprises.
Swift Mailer is a component-based library for sending emails from PHP applications. Swift Mailer supports PHP 7.0 to PHP 8.1 inclusive (the proc_* functions must be available). It does not work when used with function overloading as implemented by mbstring, that is, when mbstring.func_overload is set to 2.
By the end of this article, we will have created a contact form in Symfony using the form builder and connected it to the SwiftMailer bundle, so that visitors receive an acknowledgement that their message has been sent successfully.
Step 1: Create Contact Entity with Doctrine
First of all, we need to create and configure the database. Database-related information and credentials are available in the Application Access details panel. Add these to app/config/parameters.yml:
parameters:
    database_host: localhost
    database_port: <PORT>
    database_name: <DB_NAME>
    database_user: <DB_user_name>
    database_password: <DB_password>
Next, we need to create a contact entity with Doctrine. To use Doctrine, open an SSH terminal and go to your Symfony project:
cd application/{your_app_folder}/public_html
Now run the following command to start the entity generator:
php bin/console doctrine:generate:entity
For the contact form I need four fields: name, email, subject, and message. You can add more fields as per your requirements. The entity name will be Contact and the shortcut name will be AppBundle:Contact. The configuration format defaults to annotation, so just press the Enter key at that prompt.
Now go to src/AppBundle/Entity. You will see a Contact.php file in which Doctrine has created a getter and a setter for every field. I will use these methods to set values in the controller:
<?php
namespace AppBundle\Entity;

use Doctrine\ORM\Mapping as ORM;

/**
 * Contact
 *
 * @ORM\Table(name="contact")
 * @ORM\Entity(repositoryClass="AppBundle\Repository\ContactRepository")
 */
class Contact
{
    /**
     * @var int
     *
     * @ORM\Column(name="id", type="integer")
     * @ORM\Id
     * @ORM\GeneratedValue(strategy="AUTO")
     */
    private $id;

    /**
     * @var string
     *
     * @ORM\Column(name="name", type="string", length=255)
     */
    private $name;

    /**
     * @var string
     *
     * @ORM\Column(name="email", type="string", length=255)
     */
    private $email;

    /**
     * @var string
     *
     * @ORM\Column(name="subject", type="string", length=255)
     */
    private $subject;

    /**
     * @var string
     *
     * @ORM\Column(name="message", type="string", length=255)
     */
    private $message;

    /**
     * Get id
     *
     * @return int
     */
    public function getId()
    {
        return $this->id;
    }

    /**
     * Set name
     *
     * @param string $name
     *
     * @return Contact
     */
    public function setName($name)
    {
        $this->name = $name;

        return $this;
    }

    /**
     * Get name
     *
     * @return string
     */
    public function getName()
    {
        return $this->name;
    }

    /**
     * Set email
     *
     * @param string $email
     *
     * @return Contact
     */
    public function setEmail($email)
    {
        $this->email = $email;

        return $this;
    }

    /**
     * Get email
     *
     * @return string
     */
    public function getEmail()
    {
        return $this->email;
    }

    /**
     * Set subject
     *
     * @param string $subject
     *
     * @return Contact
     */
    public function setSubject($subject)
    {
        $this->subject = $subject;

        return $this;
    }

    /**
     * Get subject
     *
     * @return string
     */
    public function getSubject()
    {
        return $this->subject;
    }

    /**
     * Set message
     *
     * @param string $message
     *
     * @return Contact
     */
    public function setMessage($message)
    {
        $this->message = $message;

        return $this;
    }

    /**
     * Get message
     *
     * @return string
     */
    public function getMessage()
    {
        return $this->message;
    }
}
Next, we will work on the Controller and the form View.
Step 2: Create the Form in DefaultController.php
The next step is to create a form in the controller. You can also create a form in the Twig view. However, in this article, I will initialize the form fields using Symfony's form builder, and then show the form widget in the view.
Open DefaultController.php and add the following namespaces and the entity (created earlier):
namespace AppBundle\Controller;

use Sensio\Bundle\FrameworkExtraBundle\Configuration\Route;
use Symfony\Bundle\FrameworkBundle\Controller\Controller;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\Form\Extension\Core\Type\TextType;
use Symfony\Component\Form\Extension\Core\Type\TextareaType;
use Symfony\Component\Form\Extension\Core\Type\DateTimeType;
use Symfony\Component\Form\Extension\Core\Type\ChoiceType;
use Symfony\Component\Form\Extension\Core\Type\SubmitType;
use Symfony\Component\HttpFoundation\Session\Flash\FlashBag;
use AppBundle\Entity\Contact;
Now in the createAction() method, create an entity object and pass it to the form builder. The form contains the input fields (as discussed earlier).
class DefaultController extends Controller
{
    /**
     * @Route("/form", name="homepage")
     */
    public function createAction(Request $request)
    {
        $contact = new Contact;

        # Add form fields
        $form = $this->createFormBuilder($contact)
            ->add('name', TextType::class, array('label' => 'name', 'attr' => array('class' => 'form-control', 'style' => 'margin-bottom:15px')))
            ->add('email', TextType::class, array('label' => 'email', 'attr' => array('class' => 'form-control', 'style' => 'margin-bottom:15px')))
            ->add('subject', TextType::class, array('label' => 'subject', 'attr' => array('class' => 'form-control', 'style' => 'margin-bottom:15px')))
            ->add('message', TextareaType::class, array('label' => 'message', 'attr' => array('class' => 'form-control')))
            ->add('Save', SubmitType::class, array('label' => 'submit', 'attr' => array('class' => 'btn btn-primary', 'style' => 'margin-top:15px')))
            ->getForm();

        # Handle form response
        $form->handleRequest($request);
Step 3: Create View for the Contact Form
To display this form in a Twig template, create a file form.html.twig in app/Resources/views/default and add the form widget to it. The action needs to end by rendering this template, for example with return $this->render('default/form.html.twig', array('form' => $form->createView()));
{% block body %}
<div class="container">
    <div class="row">
        <div class="col-sm-4">
            <h2 class="page-header">Contact Form in Symfony</h2>
            {{ form_start(form) }}
            {{ form_widget(form) }}
            {{ form_end(form) }}
        </div>
    </div>
</div>
{% endblock %}
I have added Bootstrap classes to the code. I will now add the Bootstrap CDN to the base template to make the classes work.
Open base.html.twig from app/Resources/views and add the CDN links to it.
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8" />
    <title>{% block title %}Welcome!{% endblock %}</title>
    {% block stylesheets %}{% endblock %}
    <link rel="icon" type="image/x-icon" href="{{ asset('favicon.ico') }}" />
    <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js"></script>
    <link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet">
</head>
<body>
    <div class="alert alert-success">
        {% set flashbag_notices = app.session.flashbag.get('notice') %}
        {% if flashbag_notices is not empty %}
            <div class="flash-notice">
                {% for notice in flashbag_notices %}
                    {{ notice }}
                {% endfor %}
            </div>
        {% endif %}
        {% block body %}{% endblock %}
    </div>
    {% block javascripts %}{% endblock %}
</body>
</html>
Now I need to make form.html.twig extend base.html.twig by adding the following line at the top:
{% extends 'base.html.twig' %}
At this point, if you visit the /form route, the contact form is displayed.
Step 4: Save Values in the Database
Next, we will save the values in the database. We have already created the form in the DefaultController class. First, we will save the values in variables and then pass these variables to the related methods in the entity. Finally, we will save the entity through the persist() method.
        # check if form is submitted
        if ($form->isSubmitted() && $form->isValid()) {
            $name = $form['name']->getData();
            $email = $form['email']->getData();
            $subject = $form['subject']->getData();
            $message = $form['message']->getData();

            # set form data
            $contact->setName($name);
            $contact->setEmail($email);
            $contact->setSubject($subject);
            $contact->setMessage($message);

            # finally add data in database
            $sn = $this->getDoctrine()->getManager();
            $sn->persist($contact);
            $sn->flush();
Now, when the form is submitted, the values will be available in the database.
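Note that the email field above is a plain TextType, so nothing checks that the address is well formed, and every column in the entity is limited to 255 characters. In a Symfony project you would normally cover this with validation constraints; the plain-PHP sketch below only illustrates the kind of checks involved. The function name validateContact and the error messages are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Check contact form input before it is saved.
 *
 * @param array<string, string> $data keys: name, email, subject, message
 * @return array<string, string> field name => error message (empty when valid)
 */
function validateContact(array $data): array
{
    $errors = [];
    foreach (['name', 'email', 'subject', 'message'] as $field) {
        $value = trim($data[$field] ?? '');
        if ($value === '') {
            $errors[$field] = 'This field is required.';
        } elseif (mb_strlen($value) > 255) {
            // The entity maps every column as a string with length=255.
            $errors[$field] = 'This value is too long (255 characters max).';
        }
    }

    // Only check the address format if the field passed the basic checks.
    if (!isset($errors['email']) && filter_var(trim($data['email']), FILTER_VALIDATE_EMAIL) === false) {
        $errors['email'] = 'This is not a valid email address.';
    }

    return $errors;
}

var_dump(validateContact([
    'name' => 'Jane', 'email' => 'jane@example.com',
    'subject' => 'Hello', 'message' => 'Just saying hi.',
]) === []); // bool(true)

// Reports errors for the empty subject and the malformed email address.
print_r(validateContact([
    'name' => 'Jane', 'email' => 'not-an-email',
    'subject' => '', 'message' => 'Hi',
]));
```

Rejecting bad addresses before saving also avoids sending the acknowledgement email in the next step to an address that cannot receive it.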
Step 5: Send Acknowledgement to the User
It is important to tell the user that their message has been successfully delivered to the website. Many websites send an email that provides all user-submitted data in a proper format.
For this purpose, I will use SwiftMailer, a bundle that comes preinstalled in Symfony 3. To use it, you just need to configure it. Open app/config/parameters.yml and add the following under parameters:
    mailer_transport: gmail
    mailer_host: smtp.gmail.com
    mailer_user: <SMTP_USERNAME>
    mailer_password: <SMTP_PASSWORD>
    secret: Itmaybeanything
    encryption: tls
Now in config.yml, add the following code to read the values:
# Swiftmailer Configuration
swiftmailer:
    transport: "%mailer_transport%"
    host: "%mailer_host%"
    username: "%mailer_user%"
    password: "%mailer_password%"
    spool: { type: memory }
The configuration process is over. Now, open DefaultController.php and add the code for SwiftMailer. I will use the same data variables in this code:
            $message = \Swift_Message::newInstance()
                ->setSubject($subject)
                ->setFrom('shahroznawaz156@gmail.com')
                ->setTo($email)
                ->setBody($this->renderView('default/sendemail.html.twig', array('name' => $name)), 'text/html');
            $this->get('mailer')->send($message);
Notice that you can also send email templates. In this code snippet, I sent the email template I created (sendemail.html.twig) in app/Resources/views/default. This template has a simple test message.
{# app/Resources/views/default/sendemail.html.twig #}
<h3>You did it</h3>
Hi {{ name }}! Your message was submitted successfully.
We will get back to you soon!
Thanks!
Now, when you submit the form, you will get an acknowledgment email in your inbox.
If you have a query or would like to contribute to the discussion, do leave a comment.