One of the most widely used Artificial Neural Network models is the Multi-Layer Perceptron, which is capable of fitting any function provided it has enough neurons and layers. Obtaining a properly trained Artificial Neural Network usually requires considerable effort in determining the parameters that allow it to learn. A variety of training algorithms currently exist that work simply by minimizing the sum of squared errors. However, even if the network reaches the global minimum of the error, this does not imply that the model response is optimal. A network with a large number of weights of small amplitude initially behaves as an underfitted model and gradually overfits the data during training. Overfitted solutions are unnecessarily complex, whereas solutions with a low weight norm are underfitted and have low complexity. The Multi-Objective Algorithm (MOBJ) controls the weight amplitudes by optimizing two objective functions: the error function and the norm function. The LASSO approach combines the high generalization capability of MOBJ with automatic weight selection, generating networks with fewer weights than the MOBJ solutions. Four data sets were chosen to compare and evaluate the MOBJ, LASSO and Early-Stopping solutions: one generated from a function and three obtained from a Machine Learning Repository. Additionally, the MOBJ and LASSO algorithms are applied to a microarray data set whose samples correspond to gene expression profiles, obtained with DNA microarray technology, of breast cancer patients receiving neoadjuvant chemotherapy (treatment given prior to surgery). Originally, the data set is composed of 133 samples with 22283 attributes.
By applying the probe selection method described in the literature, 30 attributes were selected and used to train the Artificial Neural Networks. On average, the MOBJ and LASSO solutions were equivalent; the main difference is the simplified topology achieved by the LASSO training method.
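
The contrast between norm control and LASSO-style weight selection can be illustrated with a minimal sketch. It is not the training procedure used in this work: the function names, the weight values, the shrink factor and the threshold are all hypothetical. It only shows the general property that uniformly shrinking weights keeps every weight non-zero, while soft-thresholding (the standard proximal step for an absolute-value penalty) sets small weights exactly to zero.

```python
def squared_error(targets, outputs):
    """Error objective: sum of squared differences between targets and outputs."""
    return sum((d - y) ** 2 for d, y in zip(targets, outputs))


def l2_norm(weights):
    """Norm objective: Euclidean norm of the weight vector."""
    return sum(w * w for w in weights) ** 0.5


def l1_norm(weights):
    """Sum of absolute weight values, the quantity a LASSO-style constraint bounds."""
    return sum(abs(w) for w in weights)


def shrink_uniformly(weights, factor):
    """Scale every weight by the same factor; no weight becomes exactly zero."""
    return [w * factor for w in weights]


def soft_threshold(weights, threshold):
    """Reduce each magnitude by `threshold`; weights with |w| <= threshold become 0."""
    result = []
    for w in weights:
        magnitude = abs(w) - threshold
        if magnitude <= 0:
            result.append(0.0)
        else:
            result.append(magnitude if w > 0 else -magnitude)
    return result


# Hypothetical weight vector, chosen only for illustration.
weights = [0.9, -0.05, 0.3, 0.02, -0.6]

l2_shrunk = shrink_uniformly(weights, 0.5)   # all five weights remain non-zero
l1_shrunk = soft_threshold(weights, 0.1)     # the two smallest weights become zero

nonzero_l2 = sum(1 for w in l2_shrunk if w != 0.0)
nonzero_l1 = sum(1 for w in l1_shrunk if w != 0.0)
```

In this toy example both operations reduce the weight norm, but only the soft-thresholded vector loses connections, which is the sense in which a LASSO-style penalty can yield a simplified topology.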