Artificial Intelligence 5 min read

Building an Advertising Recommendation Model with Python and PyTorch

This article walks through the development of a simple advertising recommendation system using Python, covering data collection, preprocessing with label encoding, embedding categorical features with PyTorch's Embedding layer, constructing an MLP model, and launching training, while reflecting on the challenges Python developers face in the big-data era.


Being a Python developer can feel contradictory: you are immersed in big-data environments where everyone's data is laid bare ("naked swimming"), yet you are also the one expected to analyze and block intrusive ads.

The example starts by gathering a large set of ad‑placement data, including app information, ad slot IDs, media IDs, material details, titles, descriptions, and other vector features.
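The article does not show the collection step itself, so here is a tiny synthetic sample standing in for the collected ad-placement data; in a real pipeline these frames would be loaded from logs or CSV exports (paths and schema beyond the named fields are assumptions):

```python
import pandas as pd

# Synthetic stand-in for the collected ad-placement data.
# Column names match the fields used later; values are illustrative.
train_df = pd.DataFrame({
    "pkgname":  ["com.app.a", "com.app.b", "com.app.a"],
    "ver":      ["1.0", "2.3", "1.0"],
    "slotid":   ["s1", "s2", "s1"],
    "mediaid":  ["m7", "m8", "m7"],
    "material": ["img", "video", "img"],
    "label":    [0, 1, 0],   # hypothetical click label
})
test_df = pd.DataFrame({
    "pkgname":  ["com.app.b"],
    "ver":      ["2.3"],
    "slotid":   ["s2"],
    "mediaid":  ["m8"],
    "material": ["video"],
})
```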

To handle categorical fields such as pkgname, ver, slotid, mediaid, and material, label encoding is applied to both training and test sets:

```python
from sklearn.preprocessing import LabelEncoder

for col in ["pkgname", "ver", "slotid", "mediaid", "material"]:
    lbl = LabelEncoder()
    # Fit on train and test together so every category gets an ID
    lbl.fit(train_df[col].tolist() + test_df[col].tolist())
    train_df[col] = lbl.transform(train_df[col])
    test_df[col] = lbl.transform(test_df[col])
```

After encoding, textual features are transformed into vectors using Torch's Embedding layer, converting each categorical value into a dense representation based on the logarithm of its cardinality.
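This dimension rule, taking roughly log2 of a field's cardinality as the embedding width, can be sketched in isolation (the cardinality of 1000 is illustrative):

```python
import numpy as np
import torch

cardinality = 1000  # e.g. number of distinct slot IDs (illustrative)
embed_dim = max(1, int(np.log2(cardinality)))  # log2(1000) ≈ 9.97 → 9

# +1 row leaves room for an out-of-vocabulary / padding index
embedding = torch.nn.Embedding(cardinality + 1, embed_dim)

ids = torch.tensor([3, 17, 999])     # three encoded category IDs
vectors = embedding(ids)             # each ID becomes a 9-dim dense vector
print(vectors.shape)                 # torch.Size([3, 9])
```

The logarithmic rule keeps high-cardinality fields from dominating the parameter count while still giving each category a learnable dense representation.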

The core model is a multilayer perceptron (MLP) defined with PyTorch. It creates embedding dictionaries for each categorical field, concatenates them with other feature vectors, passes the result through configurable fully‑connected layers, applies ReLU and dropout, and finally outputs a logit:

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class MLP(nn.Module):
    def __init__(self, category_dict, layers=[45 + 240, 32], dropout=0.2):
        super().__init__()
        self.category_dict = category_dict
        self.dropout_p = dropout
        # nn.ModuleDict (rather than a plain dict) registers the embeddings
        # as sub-modules so their weights are actually trained.
        self.embedding_dict = nn.ModuleDict({
            key: nn.Embedding(category_dict[key] + 1,
                              max(1, int(np.log2(category_dict[key]))))
            for key in category_dict
        })
        self.fc_layers = nn.ModuleList(
            nn.Linear(in_size, out_size)
            for in_size, out_size in zip(layers[:-1], layers[1:])
        )
        self.output_layer = nn.Linear(layers[-1], 1)
        # Move the whole model once with model.to(device) after construction,
        # rather than calling .to(device) on each layer.

    def forward(self, feed_dict, embed_dict):
        # Look up the dense embedding for each categorical field
        embedding_feat = {key: self.embedding_dict[key](feed_dict[key])
                          for key in self.category_dict}
        x = torch.cat(list(embedding_feat.values()), 1)
        # Append the remaining (pre-computed) feature vectors
        x = torch.cat([x, embed_dict], 1)
        for fc in self.fc_layers:
            x = F.relu(fc(x))
            x = F.dropout(x, p=self.dropout_p, training=self.training)
        logit = self.output_layer(x)
        return logit
```

Training is then launched ("Training starts~"), demonstrating a typical workflow for an ad recommendation pipeline, while acknowledging that more sophisticated techniques may be needed for production‑grade systems.
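The article stops at kicking off training, so the loop below is a minimal sketch of that step. To keep it self-contained it uses a simple stand-in model and synthetic tensors in place of the MLP and the encoded ad data; the loss (BCE-with-logits, since the model outputs a raw logit), the Adam optimizer, and the learning rate are assumptions:

```python
import torch
import torch.nn as nn

# Stand-in for the MLP above so this sketch runs on its own;
# in the real pipeline `model` would be MLP(category_dict).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

# Synthetic features and binary click labels in place of the encoded ad data
X = torch.randn(64, 10)
y = torch.randint(0, 2, (64, 1)).float()

criterion = nn.BCEWithLogitsLoss()  # pairs with the raw-logit output
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

print("Training starts~")
for epoch in range(3):
    optimizer.zero_grad()
    logits = model(X)
    loss = criterion(logits, y)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss={loss.item():.4f}")
```

A production loop would additionally batch the data with a DataLoader, track validation metrics such as AUC, and checkpoint the model.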

machine learning · Python · recommendation system · embedding · PyTorch · MLP
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
