What do the C and gamma hyperparameters control in an SVM, and how do they relate to overfitting?
C controls the soft-margin tradeoff: large C penalizes margin violations heavily, producing a narrow margin that can overfit, while small C allows more slack, giving a wider margin that usually generalizes better but can underfit if pushed too far. Gamma (for RBF kernels) sets how far one training point's influence reaches: high gamma shrinks that reach and makes a wiggly boundary that overfits, low gamma widens it and makes the boundary smoother. You tune both jointly via cross-validation after scaling features.
How to think about it
The crisp answer
C is the inverse of the regularization strength governing the soft margin (larger C means weaker regularization), and gamma (RBF kernel) sets the reach of each training point’s influence (larger gamma means shorter reach). Both push the model between underfitting and overfitting from different angles.
What C does
Real data isn’t perfectly separable, so the soft-margin SVM allows some points to violate the margin, with C penalizing those violations. Large C = strongly penalize errors → narrow margin that fits training data tightly → high variance / overfit. Small C = tolerate violations → wider margin → more bias but better generalization. The Analytics Vidhya SVM questions frame C as the core margin-vs-error knob.
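To make the margin-vs-error knob concrete, here is a minimal sketch that scores two hand-picked 1-D classifiers with the standard soft-margin objective, 0.5 * w^2 + C * (sum of hinge losses). The toy data, the two candidates, and the function names are assumptions for illustration; they are not the output of an SVM solver.

```python
# Toy 1-D data as (x, label). The point at x=0 is a negative example
# sitting close to the positive class (a hypothetical noisy point).
data = [(-3.0, -1), (-2.0, -1), (0.0, -1), (2.0, 1), (3.0, 1)]

def soft_margin_objective(w, b, C):
    """0.5 * w^2 + C * sum of hinge losses for the classifier sign(w*x + b)."""
    hinge = sum(max(0.0, 1.0 - y * (w * x + b)) for x, y in data)
    return 0.5 * w * w + C * hinge

# Two hand-picked candidates (w, b), chosen for illustration only:
wide = (0.5, 0.0)     # margin width 2/|w| = 4, but x=0 violates the margin
narrow = (1.0, -1.0)  # margin width 2/|w| = 2, no point violates the margin

def preferred(C):
    """Name of the candidate with the lower objective at this C."""
    obj_wide = soft_margin_objective(*wide, C)
    obj_narrow = soft_margin_objective(*narrow, C)
    return "wide" if obj_wide < obj_narrow else "narrow"

for C in (0.1, 10.0):
    print(C, preferred(C))
```

With small C the violation at x=0 is cheap, so the wide margin scores better; with large C the same violation dominates the objective and the narrow margin that bends to the noisy point wins.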
What gamma does
Gamma appears in the RBF kernel, K(x, x') = exp(-gamma * ||x - x'||^2), and sets how localized each point’s influence is. High gamma → influence decays fast with distance → the boundary bends tightly around individual points → overfitting. Low gamma → broad influence → smoother, more general boundary that can underfit.
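A quick way to see the decay is to evaluate the RBF kernel, exp(-gamma * squared distance), for one pair of points at several gamma values. This is a plain-Python sketch; the helper name rbf_kernel and the sample points are assumptions for illustration.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF similarity exp(-gamma * ||x - z||^2) between two points."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

x = (0.0, 0.0)
neighbor = (1.0, 0.0)  # Euclidean distance 1 from x

# Higher gamma makes the similarity to the same neighbor drop toward 0.
for gamma in (0.1, 1.0, 10.0):
    print(gamma, rbf_kernel(x, neighbor, gamma))
```

At gamma = 0.1 the neighbor still looks very similar to x, while at gamma = 10 the similarity is close to zero, so each training point only shapes the boundary in its immediate neighborhood.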
How they interact
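They trade off jointly: a smoother model (lower gamma) can be made more flexible by raising C, and vice versa, so grid- or random-search them together via cross-validation (e.g. C and gamma on log scales) instead of tuning one at a time. A common pattern is moderate C with moderate gamma; high C with high gamma is the combination most prone to overfitting.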
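One way to run the joint search is sketched below with scikit-learn, assuming it is installed. The synthetic dataset, the grid bounds, and the fold count are arbitrary choices for illustration, not recommended defaults.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data (assumption: stands in for a real dataset).
X, y = make_moons(n_samples=200, noise=0.25, random_state=0)

# Scaling lives inside the pipeline so it is refit on each training fold.
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Log-spaced grid over both hyperparameters, searched jointly.
param_grid = {
    "svc__C": np.logspace(-2, 2, 5),
    "svc__gamma": np.logspace(-2, 2, 5),
}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

Putting the scaler inside the pipeline keeps validation-fold statistics out of the scaling step, so the cross-validated score is not optimistically biased.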
The common trap
Tuning C and gamma without scaling features first. Both the margin and the kernel are distance-based, so a feature with a large numeric range dominates every distance and the searched values of C and gamma end up reflecting the units of that feature, not the structure of the data. Also remember gamma only applies to the non-linear kernels (RBF, polynomial, sigmoid), not linear. Follow-up: “Which way is overfitting?” High C and high gamma both increase variance and overfit.
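The scaling trap can be checked with arithmetic alone. The sketch below measures how much of the squared distance between two rows comes from one large-range feature, before and after standardization. The rows (age, income) and the helper names are hypothetical.

```python
import statistics

# Hypothetical rows: (age in years, income in dollars).
rows = [(25.0, 40000.0), (30.0, 42000.0), (50.0, 41000.0)]

def income_share(a, b):
    """Fraction of the squared Euclidean distance contributed by income."""
    age_sq = (a[0] - b[0]) ** 2
    income_sq = (a[1] - b[1]) ** 2
    return income_sq / (age_sq + income_sq)

def standardize(data):
    """Rescale each column to zero mean and unit (population) std."""
    cols = list(zip(*data))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, stds))
        for row in data
    ]

raw_share = income_share(rows[0], rows[2])
scaled = standardize(rows)
scaled_share = income_share(scaled[0], scaled[2])
print(raw_share, scaled_share)
```

On the raw rows, income accounts for nearly all of the distance even though the two people differ far more in age relative to the spread of each feature; after standardization, age carries most of it.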