Thread Subject: Neural Networks Cant Generalize..Results error in New Data

From: zaheer ahmad

Date: 4 Sep, 2008 12:48:02

Message: 1 of 7

Hi all,
I have a problem with neural networks. My net doesn't produce
the required results when new (test) data is applied. It is
100% good at memorization. I'm not sure what the problem is. Can
anyone help or suggest what the actual problem is? The code
is below:

clear;clc;

% SET CHARACTERS:
alphabet =Alpha4Train();
targets=TargetSet();%eye(23);%

[Sa,Qa] = size(alphabet);
[S2,Q] = size(targets);
ValidatingChar=Alpha4Test();
TestMem=alphabet(:,77);
TestGen1=ValidatingChar(:,1);
TestGen2=ValidatingChar(:,2);
TestGen3=ValidatingChar(:,3);
% DEFINING THE NETWORK
% ====================
S1 = 100;%120

net = newff(minmax(alphabet),[S1 S2],{'logsig' 'logsig' 'logsig'},'traingdx'); % traingdx traingdm trainlm traincgf
net.LW{2,1} = net.LW{2,1}*0.01;
net.b{2} = net.b{2}*0.01;


net.performFcn = 'sse'; % Sum-Squared Error performance function
net.trainParam.goal = 0.10; % Sum-squared error goal.
net.trainParam.show = 10; % Frequency of progress displays (in epochs).
net.trainParam.epochs = 5000; % Maximum number of epochs to train.
net.trainParam.mc = 0.95; %0.65; % Momentum constant. mc=0.65 and S1=100 give good memorization

% TRAINING THE NETWORK
% ====================

P = alphabet;
T = targets;
[net,tr] = train(net,P,T);

% TRAINING THE NETWORK WITH NOISE
% ================================
netn = net;
netn.trainParam.goal = 0.60; % Mean-squared error goal.
netn.trainParam.epochs = 1000;
T = [targets targets targets targets targets targets];
for pass = 1:20
P = [alphabet, alphabet, ...
    (alphabet + randn(Sa,Qa)*0.1), ...
    (alphabet + randn(Sa,Qa)*0.2), ...
    (alphabet + randn(Sa,Qa)*0.3), alphabet];

[netn,trn] = train(netn,P,T);
end

% SIMULATION OF THE NETWORK
% ==========================

Y = sim(netn,TestMem); % 1 pe &2 te 3 sheen
Yy1 = sim(netn,TestGen1); % 1 pe &2 te 3 sheen
Yy2 = sim(netn,TestGen2); % 1 pe &2 te 3 sheen
Yy3 = sim(netn,TestGen3); % 1 pe &2 te 3 sheen
disp([round(Y) round(Yy1) round(Yy2) round(Yy3)]);

First I train on ideal data and then on some noisy data. Then I
check the results by applying characters on which I have trained
the network and some new data. It gives good results on the
training data but wrong results on new data. Simply put, it is
100% good at memorization but 100% wrong at generalization. How
can I improve it? Thanks in advance. zaheer ahmad

From: Greg Heath

Date: 11 Sep, 2008 07:41:11

Message: 2 of 7

On Sep 4, 8:48 am, "zaheer ahmad" <ahmad.zah...@yah00000.com> wrote:
> Hi all,
> I have a problem with neural networks. My net doesn't produce
> the required results when new (test) data is applied. It is
> 100% good at memorization. I'm not sure what the problem is. Can
> anyone help or suggest what the actual problem is? The code
> is below:
>
> clear;clc;
>
> % SET CHARACTERS:
> alphabet =Alpha4Train();
> targets=TargetSet();%eye(23);%

Is the empty bracket notation valid?...I've never seen it before.

> [Sa,Qa] = size(alphabet);
> [S2,Q] = size(targets);

Test to make sure that Q = Qa.
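
A minimal sketch of that check, using small placeholder matrices in place of the outputs of the poster's Alpha4Train and TargetSet functions (the sizes here are assumptions for illustration only):

```matlab
% Placeholder data: 35 features x 6 samples, 3 classes x 6 samples.
alphabet = rand(35, 6);
targets  = [eye(3), eye(3)];

[Sa, Qa] = size(alphabet);   % Sa inputs per sample, Qa samples
[S2, Q]  = size(targets);    % S2 outputs per sample, Q samples

% Input column k is paired with target column k, so the counts must match.
assert(Q == Qa, 'alphabet has %d columns but targets has %d', Qa, Q);
```

If the assertion fails, inputs and targets are misaligned and no choice of training settings will help.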

> ValidatingChar=Alpha4Test();
> TestMem=alphabet(:,77);
> TestGen1=ValidatingChar(:,1);
> TestGen2=ValidatingChar(:,2);
> TestGen3=ValidatingChar(:,3);
> % DEFINING THE NETWORK
> % ====================
> S1 = 100;%120

How did you determine this value for S1? See my post on pretraining
advice. Search Google Groups for

greg-heath pretraining-advice

and sort by date to find the original post.

Also see my posts on how to choose H (= S1). Search for

greg-heath Neq Nw

> net = newff(minmax(alphabet),[S1 S2],{'logsig' 'logsig' 'logsig'},'traingdx'); % traingdx traingdm trainlm traincgf

Delete one of the logsigs. You only have two layers of weights.
Do you really want the outputs to be restricted to the open
interval (0,1)? If not, use the default 'purelin' for output.

> net.LW{2,1} = net.LW{2,1}*0.01;
> net.b{2} = net.b{2}*0.01;

Delete the above. Initialization is automatic.

> net.performFcn = 'sse'; % Sum-Squared Error performance function

Delete and use the default 'mse'.

> net.trainParam.goal = 0.10; % Sum-squared error goal.

Use mean(var(targets))/100 for the mse goal.
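
A short sketch of what that goal evaluates to, using a hypothetical one-hot target matrix (the eye(23) that is commented out in the poster's code). The mean target variance is roughly the MSE of a net that always outputs the target mean, so dividing by 100 asks the net to explain about 99% of that variance. Note that var works down columns by default; with one sample per column, the per-output variance across samples is var(targets, 0, 2). The two agree for a symmetric matrix such as eye(N) but can differ in general.

```matlab
% Hypothetical targets: 23 classes, one sample per class, one-hot coded.
targets = eye(23);

% Goal exactly as written in the advice (variance taken down each column).
goalAsWritten = mean(var(targets)) / 100;

% Variance of each output taken across samples (along each row).
goalPerOutput = mean(var(targets, 0, 2)) / 100;

fprintf('goal as written: %.6f, per-output: %.6f\n', goalAsWritten, goalPerOutput);
```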

> net.trainParam.show = 10; % Frequency of progress displays (in epochs).
> net.trainParam.epochs = 5000; % Maximum number of epochs to train.

Use the default of 100 with the default trainlm. If that is
insufficient, increase it to 200 or more.

> net.trainParam.mc = 0.95; %0.65; % Momentum constant. mc=0.65 and S1=100 give good memorization

Delete and use defaults of trainlm.

> % TRAINING THE NETWORK
> % ====================
>
> P = alphabet;
> T = targets;
> [net,tr] = train(net,P,T);
>
> % TRAINING THE NETWORK WITH NOISE.


Why are you doing this??

> % ================================
> netn = net;
> netn.trainParam.goal = 0.60; % Mean-squared error goal.

Before you used 'sse'?

> netn.trainParam.epochs = 1000;

Can make it much smaller if you use 'trainlm'

> T = [targets targets targets targets targets targets];

Wha?

> for pass = 1:20
> P = [alphabet, alphabet, ...
>     (alphabet + randn(Sa,Qa)*0.1), ...
>     (alphabet + randn(Sa,Qa)*0.2), ...
>     (alphabet + randn(Sa,Qa)*0.3), alphabet];
>
> [netn,trn] = train(netn,P,T);
> end

The above makes no sense. You are training the net 20 times
but doing nothing with it the 1st 19 times...?


> % SIMULATION OF THE NETWORK
> % ==========================
>
> Y = sim(netn,TestMem); % 1 pe &2 te 3 sheen
> Yy1 = sim(netn,TestGen1); % 1 pe &2 te 3 sheen
> Yy2 = sim(netn,TestGen2); % 1 pe &2 te 3 sheen
> Yy3 = sim(netn,TestGen3); % 1 pe &2 te 3 sheen
> disp([round(Y) round(Yy1) round(Yy2) round(Yy3)]);
>
> First I train on ideal data and then on some noisy data. Then I
> check the results by applying characters on which I have trained
> the network and some new data. It gives good results on the
> training data but wrong results on new data. Simply put, it is
> 100% good at memorization but 100% wrong at generalization. How
> can I improve it? Thanks in advance. zaheer ahmad

Partition your data into separate training, validation and test
subsets. The training set is used to estimate weights. The
error on the validation set is used to determine network (e.g., S1)
and training algorithm parameters (e.g., number of epochs).
The generalization error is estimated via the error on the test set.
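
One way to sketch such a partition in plain MATLAB, assuming one sample per column as in the poster's code. The placeholder matrices and the 70/15/15 split are assumptions for illustration, not values from the thread:

```matlab
% Placeholder data standing in for the poster's matrices (one sample per column).
alphabet = rand(35, 100);
targets  = rand(5, 100);

Q    = size(alphabet, 2);
idx  = randperm(Q);              % shuffled sample indices
nTrn = round(0.70 * Q);          % 70% training (assumed split)
nVal = round(0.15 * Q);          % 15% validation, remainder is test

trnIdx = idx(1:nTrn);
valIdx = idx(nTrn+1 : nTrn+nVal);
tstIdx = idx(nTrn+nVal+1 : end);

Ptrn = alphabet(:, trnIdx);  Ttrn = targets(:, trnIdx);   % estimate weights
Pval = alphabet(:, valIdx);  Tval = targets(:, valIdx);   % choose S1, epochs
Ptst = alphabet(:, tstIdx);  Ttst = targets(:, tstIdx);   % estimate generalization
```

Only Ptrn and Ttrn would be passed to train. For a classification problem with few samples per class, it is probably worth shuffling within each class so that every class appears in all three subsets.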

If you want to see how robust the net is you can

1. add random noise to the weights and plot test set error
vs weight noise level
2. add random noise to the test set inputs and plot test set error
vs noise level using the original weights.
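
A rough sketch of the second robustness check. Because the trained net is not available here, a nearest-template classifier stands in for it; that classifier, the template data and the noise levels are all assumptions made for illustration:

```matlab
% Stand-in classifier (an assumption, not the poster's network): one template
% column per class, and each input is assigned to the nearest template.
templates = eye(8);                      % 8 features, 8 classes
Xtst      = repmat(templates, 1, 25);    % 200 clean test inputs
labels    = repmat(1:8, 1, 25);          % true class of each column
N         = size(Xtst, 2);

levels = 0:0.25:2;                       % noise std relative to signal std
err    = zeros(size(levels));
sigmaX = std(Xtst(:));

for k = 1:numel(levels)
    % Corrupt the test inputs only; the classifier itself is unchanged.
    Xn = Xtst + levels(k) * sigmaX * randn(size(Xtst));
    d  = zeros(8, N);
    for c = 1:8
        diffc   = Xn - repmat(templates(:, c), 1, N);
        d(c, :) = sum(diffc.^2, 1);      % squared distance to template c
    end
    [~, pred] = min(d, [], 1);
    err(k) = mean(pred ~= labels);       % fraction misclassified
    fprintf('noise level %.2f: test error %.1f%%\n', levels(k), 100 * err(k));
end
```

With a real network, the distance computation would be replaced by a call to sim on the noisy inputs, and plot(levels, err) gives the error-versus-noise curve.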

Hope this helps.

Greg

From: David

Date: 11 Sep, 2008 10:00:05

Message: 3 of 7

Greg Heath <heath@alumni.brown.edu> wrote in message <8436a506-ab26-4400-b349-3e2649bd757c@26g2000hsk.googlegroups.com>...
> On Sep 4, 8:48 am, "zaheer ahmad" <ahmad.zah...@yah00000.com> wrote:
> > Hi all,
> > I have a problem with neural networks. My net doesn't produce
> > the required results when new (test) data is applied. It is
> > 100% good at memorization. I'm not sure what the problem is. Can


Besides the technical training suggestions: what do you mean by 'new data'? How is the 'new data' related to the training data? NNs are not great generalizers. They essentially memorize the patterns they are trained with and then take unknown patterns and figure out how they fit into the set of known patterns. For example, if you train an NN with images of 'A', 'B' and 'C' and then present it with an unknown image of 'a' or 'b', it might perform poorly, but it may match 'c' because it is basically the same shape as 'C'.

From: zaheer ahmad

Date: 18 Nov, 2008 19:03:02

Message: 4 of 7

Dear Greg Heath, thanks for your reply, and sorry for my late response. I was busy with other projects and did not add this thread to my watchlist (I think).
I implemented all your suggestions in my code and tested it from various angles, but unfortunately I still have the same problem.
1. Removed one logsig (no doubt it was not required).
2. Parentheses () after a function name are not required, but MATLAB does not warn or give an error if they are written (I used them by mistake, a habit from other languages, and have now removed them).
3. The number of hidden nodes was found by trial and error.
4. I had already tried 'sse' with traingdm, trainlm and traincgf, but in vain.
5. I was looping (20 times) just to get a good result using increasing random values (but I agree it is unnecessary).
6. By new data I meant test data (data on which the net is not trained but which is similar to the training data).

My project is about Urdu OCR. As you might know, Urdu is the national language of Pakistan. It has no lower-case letters (at least not like the Roman alphabet), so rest assured I am not trying to match a, b against A, B.

As stated earlier, I have checked and implemented all your suggestions, but I think the problem is not in the training process or code. Where is it? I am not sure, but here are the details of my project:
I have 54 classes (patterns/letters) on which I am trying to train the net. I first trained the net on 10 samples of each letter, and the results were not good. I then trained it on 25 samples and then on 50 samples of each pattern, i.e. totals of 540, 1350 and 2700 samples. The results are not encouraging (or rather, disappointing), and they are the same no matter how many samples are used. Note that training converges with 120 hidden nodes and also with 100.
The main problem I am facing is that the trained net can recognise letters on which it was trained but cannot match similar data on which it was not trained.

code with amendments is as below:

clear;clc;

% SET CHARACTERS:
alphabet =Alpha4Train;
targets=TargetSet;
[Sa,Qa] = size(alphabet);
[S2,Q] =size(targets);
%%% Q=Qa
% DEFINING THE NETWORK
% ====================
S1 =120 ;

net = newff(minmax(alphabet),[S1 S2],{'logsig' 'logsig'},'traingdx');
% i have checked for trainlm toooo

net.performFcn = 'sse'; % checked on mse tooo
net.trainParam.goal = 0.10;%checked on mean(var(targets))/100;
net.trainParam.show = 20; % Frequency of progress displays (in epochs).
net.trainParam.epochs = 15000; %5000 Maximum number of epochs to train.
net.trainParam.mc = 0.5;%checked on 0.65 and 0.95 too

% TRAINING THE NETWORK

P = alphabet;
T = targets;


[net,tr] = train(net,P,T);

% TRAINING THE NETWORK WITH NOISE...GET DIRTY FOR GOOD RESULTS AT THE END
netn = net;
netn.trainParam.goal = 0.60;
netn.trainParam.epochs = 10000;%500
netn.trainParam.show = 20; %%% Frequency of progress displays (in epochs).
T = [targets targets targets];
P = [(alphabet + randn(Sa,Qa)*0.2), alphabet + randn(Sa,Qa)*0.3,alphabet];
[netn,trn] = train(netn,P,T);


I think my problem is not in the code or the training algorithms but in something else.

thanks
zaheer ahmad
ahmad.zaheer (8) yahoo.com

From: zaheer ahmad

Date: 20 Nov, 2008 06:05:03

Message: 5 of 7

I think the main problem is the number of samples. Could someone kindly reply? It is urgent. Thanks.

From: Greg Heath

Date: 2 Dec, 2008 22:06:26

Message: 6 of 7

On Nov 18, 2:03 pm, "zaheer ahmad" <ahmad.zah...@yah00000.com> wrote:
> Dear Greg Heath, thanks for your reply, and sorry for my late response. I was busy with other projects and did not add this thread to my watchlist (I think).
> I implemented all your suggestions in my code and tested it from various angles, but unfortunately I still have the same problem.
> 1. Removed one logsig (no doubt it was not required).
> 2. Parentheses () after a function name are not required, but MATLAB does not warn or give an error if they are written (I used them by mistake, a habit from other languages, and have now removed them).
> 3. The number of hidden nodes was found by trial and error.
> 4. I had already tried 'sse' with traingdm, trainlm and traincgf, but in vain.
> 5. I was looping (20 times) just to get a good result using increasing random values (but I agree it is unnecessary).
> 6. By new data I meant test data (data on which the net is not trained but which is similar to the training data).
>
> My project is about Urdu OCR. As you might know, Urdu is the national language of Pakistan. It has no lower-case letters (at least not like the Roman alphabet), so rest assured I am not trying to match a, b against A, B.
>
> As stated earlier, I have checked and implemented all your suggestions, but I think the problem is not in the training process or code. Where is it? I am not sure, but here are the details of my project:
> I have 54 classes (patterns/letters) on which I am trying to train the net. I first trained the net on 10 samples of each letter, and the results were not good. I then trained it on 25 samples and then on 50 samples of each pattern, i.e. totals of 540, 1350 and 2700 samples. The results are not encouraging (or rather, disappointing), and they are the same no matter how many samples are used. Note that training converges with 120 hidden nodes and also with 100.
> The main problem I am facing is that the trained net can recognise letters on which it was trained but cannot match similar data on which it was not trained.

Maybe your classes are not well defined
and have to be partitioned into subclasses
via clustering (e.g., k-means).

Overlay the plot of each misclassified character
(blue) on the plot of the mean of the class to
which it was assigned (red) and the plot of
the mean of the correct class (black).

This should give some insight into the difficulty.
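
A minimal sketch of that overlay, treating each character as a feature vector (one sample per column). The random placeholder data, the class sizes and the choice of which sample counts as misclassified are all assumptions for illustration:

```matlab
% Placeholder data: 3 classes, 20 features, 10 samples per class (assumed).
nFeat = 20;  nPer = 10;  nCls = 3;
labels = kron(1:nCls, ones(1, nPer));      % true class of each column
X = rand(nFeat, nCls * nPer);              % one sample per column

% Mean feature vector of each class.
classMean = zeros(nFeat, nCls);
for c = 1:nCls
    classMean(:, c) = mean(X(:, labels == c), 2);
end

% Suppose sample k (true class 1) was assigned to class 2 by the net.
k = 1;  assigned = 2;  correct = labels(k);

% Blue: misclassified sample. Red: assigned-class mean. Black: correct-class mean.
plot(1:nFeat, X(:, k), 'b', ...
     1:nFeat, classMean(:, assigned), 'r', ...
     1:nFeat, classMean(:, correct), 'k');
legend('misclassified sample', 'mean of assigned class', 'mean of correct class');
xlabel('feature index');
```

If the features are raw pixels, reshaping each vector back to the image size and viewing the three images side by side may be easier to read than line plots.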

> code with amendments is as below:
>
> clear;clc;
>
> % SET CHARACTERS:
> alphabet =Alpha4Train;
> targets=TargetSet;
> [Sa,Qa] = size(alphabet);
> [S2,Q] =size(targets);

Well, what are they?

> %%% Q=Qa
> % DEFINING THE NETWORK
> % ====================
> S1 =120 ;

How did you determine this?

> net =newff(minmax(alphabet),[S1 S2],{'logsig' 'logsig'},'traingdx');

Why not standardize the inputs and use
tansig in the hidden layer?
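
A sketch of input standardization in plain MATLAB, with each row (input variable) shifted to zero mean and scaled to unit standard deviation. The placeholder data is an assumption. The toolbox also ships helpers for this (prestd in older releases, mapstd in newer ones); check the documentation for your version.

```matlab
% Placeholder inputs: 35 features, 50 samples, one sample per column.
alphabet = 5 + 2 * rand(35, 50);

mu    = mean(alphabet, 2);          % per-feature mean over the training samples
sigma = std(alphabet, 0, 2);        % per-feature standard deviation
sigma(sigma == 0) = 1;              % leave constant features unscaled

Q  = size(alphabet, 2);
Pn = (alphabet - repmat(mu, 1, Q)) ./ repmat(sigma, 1, Q);

% New data must be transformed with the training mu and sigma,
% never with statistics computed from the new data itself.
newChar  = 5 + 2 * rand(35, 1);
newCharN = (newChar - mu) ./ sigma;
```

Forgetting to apply the same transformation to test inputs is a common cause of a net that memorizes well but fails on new data.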

> % i have checked for trainlm toooo
>
> net.performFcn = 'sse'; % checked on mse tooo

Forget sse

> net.trainParam.goal = 0.10;%checked on mean(var(targets))/100;

Looks too high. Probably something wrong.

> net.trainParam.show = 20; % Frequency of progress displays (in epochs).
> net.trainParam.epochs = 15000; %5000 Maximum number of epochs to train.
> net.trainParam.mc = 0.5;%checked on 0.65 and 0.95 too
>
> % TRAINING THE NETWORK
>
> P = alphabet;
> T = targets;
>
> [net,tr] = train(net,P,T);

Where is your error calculation?
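
A sketch of the kind of error calculation that could follow training, assuming one-hot targets with one column per sample. The output matrix Y here is fabricated placeholder data standing in for the result of sim(net, P):

```matlab
% Placeholder one-hot targets (3 classes, 6 samples) and stand-in outputs.
T = [eye(3), eye(3)];
Y = T + 0.1 * rand(size(T));       % stands in for Y = sim(net, P)
Y(:, 2) = [0.9; 0.2; 0.1];         % force one wrong answer for illustration

% The predicted class is the output unit with the largest activation.
[~, trueClass] = max(T, [], 1);
[~, predClass] = max(Y, [], 1);

mseVal  = mean((T(:) - Y(:)).^2);          % mean squared error
errRate = mean(predClass ~= trueClass);    % fraction misclassified

fprintf('MSE = %.4f, classification error = %.1f%%\n', mseVal, 100 * errRate);
```

Computing both numbers separately for the training, validation and test subsets shows directly how large the gap between memorization and generalization is.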

> % TRAINING THE NETWORK WITH NOISE...GET DIRTY FOR GOOD RESULTS AT THE END
> netn = net;
> netn.trainParam.goal = 0.60;
> netn.trainParam.epochs = 10000;%500
> netn.trainParam.show = 20; %%% Frequency of progress displays (in epochs).
> T = [targets targets targets];
> P = [(alphabet + randn(Sa,Qa)*0.2), alphabet + randn(Sa,Qa)*0.3,alphabet];

The noise has to be scaled to the standard deviation
of the signal. Have you actually looked at these noisy
characters?
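
A sketch of noise scaled to the signal, assuming binary pixel inputs. The placeholder data and the 20% noise fraction are assumptions for illustration:

```matlab
% Placeholder binary pixel data: 35 pixels per character, 20 characters.
alphabet = double(rand(35, 20) > 0.5);

sigmaSignal = std(alphabet(:));    % overall standard deviation of the signal
noiseFrac   = 0.2;                 % noise std as a fraction of signal std (assumed)

noisy = alphabet + noiseFrac * sigmaSignal * randn(size(alphabet));

% To look at a noisy character, reshape one column to the image size,
% e.g. imagesc(reshape(noisy(:, 1), 7, 5)) for an assumed 7-by-5 glyph.
```

With a fixed factor such as randn(Sa,Qa)*0.3, the effective noise level depends entirely on the range of the inputs, which is why it is worth viewing a few noisy samples before training on them.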

Q ~= Qa.

> [netn,trn] = train(netn,P,T);
>
> I think my problem is not in the code or the training algorithms but in something else.

Concentrate on the noiseless characters.

Hope this helps.

Greg

From: zaheer ahmad

Date: 6 Dec, 2008 21:00:17

Message: 7 of 7

This problem has been rephrased and reposted at
http://www.mathworks.com/matlabcentral/newsreader/view_thread/240079#614596
because I was not patient enough to wait for replies here. Sorry for the inconvenience.

zaheer ahmad
