ATTENTION: SSE2 compilation of SZTAKI?!


Profile Piotr Skrodzewicz
Joined: May 26 06
Posts: 10
Credit: 8,171
RAC: 0
Message 2786 - Posted 29 May 2006 12:48:18 UTC

    Modern CPUs (at least Athlon64 and Pentium4) have the SSE2 instruction set. These instructions can greatly speed up computation.

    There are special versions of SETI and the BOINC client.

    The performance gain is 40-50% on an Athlon64 3000+ (Venice).

    My speed on the above CPU with SSE2:
    Measured floating point speed 3317.59 million ops/sec
    Measured integer speed 10803.88 million ops/sec

    Four-processor Power Mac (the number 2 computer in SETI):
    Measured floating point speed 7003.36 million ops/sec
    Measured integer speed 20690.23 million ops/sec

    ONLY TWO TIMES FASTER!!!
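As a quick check of the "only two times faster" remark, the ratios can be worked out directly from the four benchmark figures quoted above. This is plain arithmetic on the posted numbers; the variable names are only illustrative.

```python
# Benchmark figures quoted in the post above (million ops/sec).
athlon64 = {"float": 3317.59, "int": 10803.88}   # Athlon64 3000+ with the SSE2 client
powermac = {"float": 7003.36, "int": 20690.23}   # four-processor Power Mac

# A ratio above 1 means the Power Mac scores higher on that benchmark.
ratios = {kind: powermac[kind] / athlon64[kind] for kind in athlon64}

for kind, ratio in ratios.items():
    print(f"{kind}: Power Mac / Athlon64 = {ratio:.2f}x")
```

The floating point ratio comes out a little above 2 and the integer ratio a little below 2, which matches the claim in the post.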

    ---------

    So please seriously consider releasing an SSE2 version of SZTAKI!
    ____________

    akosf
    Joined: Aug 30 05
    Posts: 62
    Credit: 510,419
    RAC: 0
    Message 2787 - Posted 29 May 2006 18:13:31 UTC

      Hm...

      I made a 'light-optimised' version from v1.12.
      Could somebody test it? (sdg11201)

      I didn't use any new instructions, so it should run on a 386 too.

      Profile Nightbird
      Forum moderator
      Joined: Jul 12 05
      Posts: 920
      Credit: 114,924
      RAC: 0
      Message 2789 - Posted 29 May 2006 21:20:29 UTC

        Last modified: 29 May 2006 21:32:26 UTC

        Installed on an Athlon64 3200+ (Winchester -> SSE2) and on a Barton 3200+ (-> SSE) :)
        ____________

        Profile Ananas
        Joined: Jul 12 05
        Posts: 222
        Credit: 665,833
        RAC: 0
        Message 2793 - Posted 29 May 2006 21:56:01 UTC

          Last modified: 29 May 2006 22:55:38 UTC

          Trying on P4/2600 (FSB100), single P3s/1266, PM1600 Banias, OCed XP1800+@2000

          I'm especially curious about the Banias: with the original client it is too slow compared to other projects and compared to other boxes running SZTAKI.


          The crunching is faster than my typing ... the P4 has already done one in only 16 minutes; the second one needed 20 minutes, still way faster than before (about half the time).

          XP1800: 25 minutes; before it was about 53 (on average).

          No average values yet, of course, and nothing to say yet about whether the results are valid. Both will show later.

          Profile [B^S] thierry@home
          Joined: Jul 12 05
          Posts: 98
          Credit: 255,646
          RAC: 0
          Message 2794 - Posted 29 May 2006 21:58:15 UTC

            Last modified: 29 May 2006 22:47:19 UTC

            I've installed it on a P4 3.0 HT. It is running and it seems much faster :-)

            +/- 20 minutes per WU.

            Well, the WUs do not all have the same length: some are done in 19 minutes, some in 5 minutes, 7 minutes, ...

            It also works very nicely on a Pentium D 2.8.
            ____________

            Profile Ananas
            Joined: Jul 12 05
            Posts: 222
            Credit: 665,833
            RAC: 0
            Message 2795 - Posted 29 May 2006 23:03:00 UTC

              Last modified: 29 May 2006 23:13:07 UTC

              Banias 1600 : first result = 21 min (old avg. : 1:05)

              P3s/1266 : First = 0:35 (old avg. : 1:15)

              Profile Rebirther
              Joined: Jul 12 05
              Posts: 81
              Credit: 15,472
              RAC: 0
              Message 2796 - Posted 29 May 2006 23:16:22 UTC

                Last modified: 29 May 2006 23:27:08 UTC

                P4 3.2 GHz, HT on:
                standard app 29 min / 30 min --> optimized app 25 min / 12 min / 16 min

                @Ananas: my old values are also over 1 h, but I think the new WUs have a different length

                Profile Nightbird
                Forum moderator
                Joined: Jul 12 05
                Posts: 920
                Credit: 114,924
                RAC: 0
                Message 2797 - Posted 30 May 2006 7:22:58 UTC

                  Last modified: 30 May 2006 7:25:10 UTC

                  It is quite possible that a WU on my Athlons needs half the time now.
                  ____________

                  Profile Ananas
                  Joined: Jul 12 05
                  Posts: 222
                  Credit: 665,833
                  RAC: 0
                  Message 2798 - Posted 30 May 2006 7:26:13 UTC

                    The Pentium M 1600 (Banias) gets the most out of the changes. All WUs it has crunched so far have been a lot faster, cutting the time by more than half.

                    Some valid results are now in for all boxes too; no invalid results so far.

                    AlexA[boinc.ru]
                    Joined: Mar 16 06
                    Posts: 8
                    Credit: 80,473
                    RAC: 0
                    Message 2802 - Posted 30 May 2006 10:10:32 UTC

                      Works perfectly. Congratulations.
                      ____________

                      Profile Rebirther
                      Joined: Jul 12 05
                      Posts: 81
                      Credit: 15,472
                      RAC: 0
                      Message 2804 - Posted 30 May 2006 13:34:19 UTC

                        Last modified: 30 May 2006 13:34:37 UTC

                        Now for the same WU: another 3.2 GHz P4 (HT on) compared with my 3.2 GHz P4 (HT on, Northwood)

                        6537.80 sec vs. 2427.70 sec = 63% less time (about 2.7x faster), that's impressive
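The 63% figure is the share of run time saved, which is not the same number as the speedup factor. A minimal sketch of both calculations, using only the two times from the post (the variable names are mine):

```python
slower_seconds = 6537.80   # the longer of the two times quoted in the post
faster_seconds = 2427.70   # the shorter of the two times quoted in the post

time_saved = (slower_seconds - faster_seconds) / slower_seconds   # fraction of run time saved
speedup = slower_seconds / faster_seconds                         # how many times faster

print(f"time saved: {time_saved:.0%}")
print(f"speedup:    {speedup:.2f}x")
```

So the same pair of times can be read either as roughly 63% less time or as roughly 2.7 times faster.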

                        Profile Piotr Skrodzewicz
                        Joined: May 26 06
                        Posts: 10
                        Credit: 8,171
                        RAC: 0
                        Message 2805 - Posted 30 May 2006 14:29:15 UTC

                          You see? There IS room for optimisation. Try compiling:

                          1) MMX
                          2) SSE
                          3) SSE2
                          4) SSE3

                          versions and check the results on both a P4 and an Athlon64.

                          Setibeta has, and SIMAP will have, special versions depending on the CPU instruction set. If I recall correctly, no user intervention is needed: BOINC selects and downloads the special version automagically.
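To illustrate the idea of picking a build by instruction set, here is a minimal sketch. It is not how BOINC, Setibeta or SIMAP actually do it; `pick_variant`, `read_cpu_flags` and the variant names are hypothetical, and the flag reader only works on Linux, where the kernel lists SSE3 as `pni`.

```python
# Preference order, most capable first. Each entry maps a (hypothetical)
# build variant to the CPU flag names that indicate support for it.
VARIANTS = [
    ("sse3", ("sse3", "pni")),   # Linux reports SSE3 as "pni"
    ("sse2", ("sse2",)),
    ("sse", ("sse",)),
    ("mmx", ("mmx",)),
]


def pick_variant(cpu_flags):
    """Return the most capable build variant the CPU supports."""
    flags = {flag.lower() for flag in cpu_flags}
    for variant, names in VARIANTS:
        if any(name in flags for name in names):
            return variant
    return "generic"   # fallback build without any SIMD extensions


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the CPU flag list on Linux; returns an empty set elsewhere."""
    try:
        with open(path) as cpuinfo:
            for line in cpuinfo:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()
```

For example, `pick_variant(read_cpu_flags())` would choose a variant for the current machine, and a CPU that reports only `mmx` and `sse` (such as the Barton mentioned earlier in the thread) would get the `sse` build.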
                          ____________

                          [AF>EDLS>Physique] riddim
                          Joined: Oct 29 05
                          Posts: 1
                          Credit: 9,102
                          RAC: 0
                          Message 2807 - Posted 30 May 2006 17:48:33 UTC - in response to Message 2787.

                            Last modified: 30 May 2006 18:01:04 UTC

                            I am testing the application. When the first WU finishes, I will post the information.


                            Sign [AF>Est>IDF>EDLS>Physique] Pas93.Jul

                            ____________

                            Profile [AF>EDLS>Physique] Pas93
                            Joined: Aug 3 05
                            Posts: 14
                            Credit: 65,363
                            RAC: 0
                            Message 2808 - Posted 30 May 2006 18:50:29 UTC - in response to Message 2807.

                              Here are the numbers:

                              Sempron 3000+, 512 MB RAM

                                      Time      Pending
                              Usual:  3,238.03  7.98
                              Now:    1,308.41  3.22


                              Thank you AKOSF
                              ____________

                              Profile [AF>Linux>IDF] BlackStar95
                              Joined: Apr 3 06
                              Posts: 4
                              Credit: 14,829
                              RAC: 0
                              Message 2809 - Posted 30 May 2006 18:59:47 UTC

                                This 'light-optimised' version is a BOMB!!

                                Thank you, AKOSF!
                                ____________

                                Profile [AF>EDLS>Physique] Pas93
                                Joined: Aug 3 05
                                Posts: 14
                                Credit: 65,363
                                RAC: 0
                                Message 2810 - Posted 30 May 2006 20:02:57 UTC - in response to Message 2808.

                                  Here are the numbers:
                                  Athlon 2000+, 256 MB RAM

                                           Time      Pending
                                  Before:  2,941.70  6.90
                                  After:   1,960.23  4.60
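The Sempron and Athlon numbers posted by Pas93 can be turned into speedup factors as follows. The sketch assumes the first column is run time and the second is pending credit, which is how the "Times pending" headings read; the names in the code are only illustrative.

```python
# (time, pending credit) pairs as posted: before and after the optimized app.
results = {
    "Sempron 3000+": ((3238.03, 7.98), (1308.41, 3.22)),
    "Athlon 2000+": ((2941.70, 6.90), (1960.23, 4.60)),
}

summary = {}
for cpu, ((old_time, old_credit), (new_time, new_credit)) in results.items():
    speedup = old_time / new_time            # how many times faster the WU ran
    credit_ratio = old_credit / new_credit   # how much the pending credit shrank
    summary[cpu] = (speedup, credit_ratio)
    print(f"{cpu}: {speedup:.2f}x faster, pending credit down {credit_ratio:.2f}x")
```

On both machines the pending credit falls by almost exactly the same factor as the time, which suggests the credit figure scales with run time.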
                                  ____________

                                  Profile Skip Da Shu
                                  Joined: Jul 12 05
                                  Posts: 64
                                  Credit: 290,501
                                  RAC: 0
                                  Message 2814 - Posted 30 May 2006 21:40:55 UTC

                                    In my 'szdg.lpds.sztaki.hu_szdg' folder, under 'Projects', I have this executable:
                                    search_1.12_windows_intelx86.exe 196KB 5/28/2006 4:51am

                                    Is this the one that should be replaced with
                                    search_1.12_windows_intelx86.exe 196KB 5/29/2006 6:56pm
                                    from the sdq11210.zip?

                                    I question this only because the dates are so close.

                                    Thanx, Skip
                                    ____________
                                    - da shu @ HeliOS,
                                    "A child's exposure to technology should never be predicated on an ability to afford it."

                                    Profile Rebirther
                                    Joined: Jul 12 05
                                    Posts: 81
                                    Credit: 15,472
                                    RAC: 0
                                    Message 2815 - Posted 30 May 2006 21:57:06 UTC

                                      Yep, that's right ;)

                                      Profile Skip Da Shu
                                      Joined: Jul 12 05
                                      Posts: 64
                                      Credit: 290,501
                                      RAC: 0
                                      Message 2817 - Posted 30 May 2006 22:16:29 UTC - in response to Message 2815.

                                        And is this REALLY an SSE2 app? I accidentally installed it on one of my Thorton-core XP machines (SSE but not SSE2), and it was running along fine until I stopped it and swapped the optimized executable back for the original from another machine! I would've expected it to abend rather quickly. Thanx, Skip

                                        Profile Rebirther
                                        Joined: Jul 12 05
                                        Posts: 81
                                        Credit: 15,472
                                        RAC: 0
                                        Message 2818 - Posted 30 May 2006 22:31:14 UTC - in response to Message 2817.

                                          It's not SSE2; only some routines are optimized, because older AMD CPUs don't have SSE2, but this version is much better than the original.

                                          Profile Skip Da Shu
                                          Joined: Jul 12 05
                                          Posts: 64
                                          Credit: 290,501
                                          RAC: 0
                                          Message 2819 - Posted 30 May 2006 22:32:30 UTC

                                            5/30/2006 4:26:38 PM|SZTAKI Desktop Grid|Throughput 19541 bytes/sec
                                            5/30/2006 4:27:14 PM|SZTAKI Desktop Grid|Signature verification failed for search_1.12_windows_intelx86.exe
                                            5/30/2006 4:27:14 PM|SZTAKI Desktop Grid|Started download of file search_1.12_windows_intelx86.exe
                                            5/30/2006 4:27:14 PM||Project communication failed: attempting access to reference site
                                            5/30/2006 4:27:15 PM|SZTAKI Desktop Grid|Temporarily failed download of search_1.12_windows_intelx86.exe: http error
                                            5/30/2006 4:27:15 PM|SZTAKI Desktop Grid|Backing off 1 minutes and 0 seconds on download of file search_1.12_windows_intelx86.exe

                                            This is on a Sempron 64 w/ SSE2 & 3

                                            Could this be because this machine may never have done any SZTAKI work using search_1.12 before, and therefore it's doing some sort of signature check that it wouldn't do if it had already received the executable file?

                                            Profile Rebirther
                                            Joined: Jul 12 05
                                            Posts: 81
                                            Credit: 15,472
                                            RAC: 0
                                            Message 2821 - Posted 30 May 2006 22:38:37 UTC - in response to Message 2819.

                                              Last modified: 30 May 2006 22:41:48 UTC

                                              --> Reset the project first <--
                                              I had the same problem. The application must be downloaded first: begin one WU, then close BOINC and overwrite the executable with the optimized one. Start BOINC again and everything runs fine.
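The file-replacement step described above could be scripted roughly like this. It is a sketch under assumptions: `swap_executable` is a hypothetical helper, the project directory and the location of the optimized file are whatever applies on your machine, and the script must only be run while BOINC is closed. Only the executable name is taken from the thread.

```python
import shutil
from pathlib import Path


def swap_executable(project_dir, optimized_exe,
                    name="search_1.12_windows_intelx86.exe"):
    """Back up the downloaded app and copy the optimized build over it.

    Intended to run only while BOINC is closed, after the project has
    downloaded the original executable and started one work unit.
    """
    target = Path(project_dir) / name
    if not target.exists():
        raise FileNotFoundError(f"{target} not found: let BOINC download the app first")
    backup = target.with_name(target.name + ".orig")
    if not backup.exists():              # keep only the first (official) copy
        shutil.copy2(target, backup)
    shutil.copy2(optimized_exe, target)  # overwrite with the optimized build
    return backup
```

Keeping a `.orig` copy makes it easy to put the official executable back, as Skip did on his Thorton machine.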

                                              Profile Skip Da Shu
                                              Joined: Jul 12 05
                                              Posts: 64
                                              Credit: 290,501
                                              RAC: 0
                                              Message 2826 - Posted 31 May 2006 3:29:36 UTC

                                                Last modified: 31 May 2006 3:50:18 UTC

                                                Rebirther - you "the man" ... with all the answers :-)

                                                Thanx much, I'll install it on my Socket A Barton/Thorton-core machines next.

                                                The OC'd Sempron64s sure like it:

                                                2.75GHz Palermo(128K L2) - ran 2, avg 00:11:45
                                                2.60GHz Palermo(256K L2) - ran 1, avg 00:13:19
                                                2.25GHz Palermo(256K L2) - ran 2, avg 00:18:15
                                                2.25GHz Palermo(128K L2) - ran 2, avg 00:22:24

                                                The Pentium M 1.6G (Banias) came in at 00:25:12, which I believe to be about half its "normal" time, but I can't back that up with actual numbers because I'd just cleared out my BoincLogX files.

                                                The old PIII 0.930 GHz went from 4,274.36 earlier today to 1,897.45 on the last WU.

                                                Thanx again for your help, Skip

                                                Profile [AF>Linux>IDF] BlackStar95
                                                Joined: Apr 3 06
                                                Posts: 4
                                                Credit: 14,829
                                                RAC: 0
                                                Message 2830 - Posted 31 May 2006 16:19:04 UTC

                                                  Akosf, when will there be an optimized application for Linux?
                                                  I can't wait to test it on my computer, which runs Linux!
                                                  Your work is very good!

                                                  Thank you!

                                                  ____________

                                                  Profile Nightbird
                                                  Forum moderator
                                                  Joined: Jul 12 05
                                                  Posts: 920
                                                  Credit: 114,924
                                                  RAC: 0
                                                  Message 2833 - Posted 31 May 2006 18:17:50 UTC - in response to Message 2830.

                                                    AkosF explained on the SIMAP and Einstein boards that he only works on Windows.
                                                    ____________

                                                    Profile [AF>Linux>IDF] BlackStar95
                                                    Joined: Apr 3 06
                                                    Posts: 4
                                                    Credit: 14,829
                                                    RAC: 0
                                                    Message 2834 - Posted 31 May 2006 18:26:18 UTC

                                                      OK, I'm sorry!
                                                      ____________

                                                      Profile Nightbird
                                                      Forum moderator
                                                      Joined: Jul 12 05
                                                      Posts: 920
                                                      Credit: 114,924
                                                      RAC: 0
                                                      Message 2840 - Posted 31 May 2006 22:30:42 UTC

                                                        Last modified: 31 May 2006 22:31:06 UTC

                                                        Well, it would probably be possible to incorporate AkosF's ideas into an optimized version for Linux, I guess.
                                                        ____________

                                                        akosf
                                                        Joined: Aug 30 05
                                                        Posts: 62
                                                        Credit: 510,419
                                                        RAC: 0
                                                        Message 2842 - Posted 31 May 2006 23:04:27 UTC

                                                          Last modified: 31 May 2006 23:09:44 UTC

                                                          Hi people!

                                                          Somebody asked me about this half-optimized version. This code doesn't use SSE2.
                                                          Well-optimized SSE2 code would be much faster (at least 5-6 times faster than the original).

                                                          ... about the source code. I don't need it, so I don't have it and I don't want to have it.

                                                          ... about Linux. I'm sorry, but I have no time for Linux. I'm not a programmer.

                                                          vonHalenbach
                                                          Joined: May 31 06
                                                          Posts: 25
                                                          Credit: 461
                                                          RAC: 0
                                                          Message 2846 - Posted 1 Jun 2006 8:00:14 UTC - in response to Message 2840.

                                                            "Well, it would probably be possible to incorporate AkosF's ideas into an optimized version for Linux, I guess."


                                                            Maybe Bernd Machenschalk from the Einstein@home project can do that if we ask him kindly.
                                                            ____________

                                                            Profile Piotr Skrodzewicz
                                                            Joined: May 26 06
                                                            Posts: 10
                                                            Credit: 8,171
                                                            RAC: 0
                                                            Message 2849 - Posted 1 Jun 2006 11:33:58 UTC

                                                              akosf wrote: "Somebody asked me about this half-optimized version. This code doesn't use SSE2.
                                                              Well-optimized SSE2 code would be much faster (at least 5-6 times faster than the original).

                                                              ... about the source code. I don't need it, so I don't have it and I don't want to have it.

                                                              ... about Linux. I'm sorry, but I have no time for Linux. I'm not a programmer."


                                                              If the above is true, how did you optimize SZTAKI?

                                                              And the most important question: if an SSE2 version would indeed be 5-6x faster, WHY IS THERE STILL NO SSE2 VERSION?
                                                              ____________

                                                              akosf
                                                              Joined: Aug 30 05
                                                              Posts: 62
                                                              Credit: 510,419
                                                              RAC: 0
                                                              Message 2850 - Posted 1 Jun 2006 13:23:49 UTC - in response to Message 2849.

                                                                "If the above is true, how did you optimize SZTAKI?"

                                                                In a different, much more difficult way.

                                                                "And the most important question: if an SSE2 version would indeed be 5-6x faster, WHY IS THERE STILL NO SSE2 VERSION?"

                                                                Probably nobody has done an SSE2 optimization for SZTAKI.

                                                                Rakarin
                                                                Joined: Feb 4 06
                                                                Posts: 17
                                                                Credit: 46,513
                                                                RAC: 0
                                                                Message 2852 - Posted 1 Jun 2006 13:41:18 UTC

                                                                  Last modified: 1 Jun 2006 13:42:00 UTC

                                                                  My review has good and bad points.

                                                                  Good: The speed is amazing. I am running BOINC on an AMD Athlon XP 3200+. The work units were taking about 40 minutes. They are now taking less than 15 minutes. I have been watching my credits. All reported success. Since I run SZTAKI on a Windows PC, Linux PC, and Mac, I do not know which units given credit are from which computer. I will watch my results.

                                                                  Bad: It looks like there is a small problem with the checkpoint. I was just watching my BOINC work units. I was trying to figure the speed for the new SZTAKI and SIMAP engines. While I was watching, SZTAKI was running first. Then, I saw BOINC change from SZTAKI to LHC@home. When it changed, the progress meter dropped from sixty-something (I did not see the exact number) to 50.000%, and the CPU time dropped from about 8 minutes to 5 minutes and some seconds (I did not notice exactly).

                                                                  Is the checkpoint in the optimized client only at 50%?

                                                                  Thank you for your work. The increased speed is very, very nice.
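The behaviour described above (progress falling back from sixty-something percent to 50% when BOINC switches projects) is what one would see if the task resumed from its last saved checkpoint. Here is a toy model of that, assuming a single checkpoint at 50% and that the task is removed from memory on the switch. Both are guesses: the thread does not say where the real application checkpoints, and `CheckpointedTask` is purely illustrative.

```python
class CheckpointedTask:
    """Toy work unit that saves its state only at fixed progress marks."""

    def __init__(self, checkpoint_marks=(50,)):
        self.checkpoint_marks = sorted(checkpoint_marks)
        self.progress = 0          # percent done in the running process
        self.saved_progress = 0    # percent stored in the last checkpoint

    def run_until(self, percent):
        """Advance to `percent`, checkpointing at every mark passed."""
        for mark in self.checkpoint_marks:
            if self.progress < mark <= percent:
                self.saved_progress = mark
        self.progress = percent

    def preempt_and_resume(self):
        """Simulate removal from memory: work since the checkpoint is lost."""
        self.progress = self.saved_progress
        return self.progress


task = CheckpointedTask()
task.run_until(62)                 # "sixty-something" percent
print(task.preempt_and_resume())   # prints 50: back at the last checkpoint
```

With more checkpoint marks, for example `CheckpointedTask((25, 50, 75))`, less work is lost on each switch.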

                                                                  ____________

                                                                  akosf
                                                                  Joined: Aug 30 05
                                                                  Posts: 62
                                                                  Credit: 510,419
                                                                  RAC: 0
                                                                  Message 2853 - Posted 1 Jun 2006 13:51:35 UTC - in response to Message 2852.

                                                                    Last modified: 1 Jun 2006 13:52:12 UTC

                                                                    Bad: It looks like there is a small problem with the checkpoint. I was just watching my BOINC work units. I was trying to figure the speed for the new SZTAKI and SIMAP engines. While I was watching, SZTAKI was running first. Then, I saw BOINC change from SZTAKI to LHC@home. When it changed, the progress meter dropped from sixty-something (I did not see the exact number) to 50.000%, and the CPU time dropped from about 8 minutes to 5 minutes and some seconds (I did not notice exactly).

                                                                    Is the checkpoint in the optimized client only at 50%?

                                                                    I changed only the most-used calculation routine, so this problem must come from somewhere else.

                                                                    The increased speed is very, very nice.

                                                                    Thanks. It was the result of 13 minutes of work.

                                                                    Profile Piotr Skrodzewicz
                                                                    Send message
                                                                    Joined: May 26 06
                                                                    Posts: 10
                                                                    Credit: 8,171
                                                                    RAC: 0
                                                                    Message 2855 - Posted 1 Jun 2006 16:14:17 UTC - in response to Message 2850.

                                                                      Last modified: 1 Jun 2006 16:15:35 UTC

                                                                      And the most important question: if an SSE2 version would indeed be 5-6x faster, WHY IS THERE STILL NO SSE2 version?

                                                                      Probably nobody did SSE2 optimization for SZTAKI.


                                                                      Well, I expected an answer from the people responsible for coding/compiling SZTAKI. They should care more about SZTAKI participants:

                                                                      1) release the source code to the public OR to a closed group of programmers only, AND allow them to optimize this code.

                                                                      2) release platform-specific versions: MMX/SSE/SSE2 and eventually SSE3.
                                                                      ____________

                                                                      akosf
                                                                      Avatar
                                                                      Send message
                                                                      Joined: Aug 30 05
                                                                      Posts: 62
                                                                      Credit: 510,419
                                                                      RAC: 0
                                                                      Message 2856 - Posted 1 Jun 2006 16:47:24 UTC - in response to Message 2855.

                                                                        Last modified: 1 Jun 2006 17:00:09 UTC

                                                                        Well, I expected an answer from the people responsible for coding/compiling SZTAKI. They should care more about SZTAKI participants:

                                                                        1) release the source code to the public OR to a closed group of programmers only, AND allow them to optimize this code.

                                                                        You are right that two heads are better than one, but don't forget two things.
                                                                        1. The SZTAKI code doesn't need more heads because it is a very simple program.
                                                                        2. The safety of the code and of the BOINC users!

                                                                        2) release platform-specific versions: MMX/SSE/SSE2 and eventually SSE3.

                                                                        I can tell you that sometimes clever programming gives better results than differently compiled versions.

                                                                        Only SSE2 or SSE3 compilation would be good for SZTAKI, and I think a simple native compilation would improve the speed by only 0-3%. Not more.
                                                                        MMX and SSE aren't good for SZTAKI. But I made another very, very small modification (about another 10 minutes of work) and it improved the speed by about 70%.

                                                                        You can download from here: sdg11202

                                                                        speeds with the same wu:
                                                                        original: 3181 sec (speed: 1.00)
                                                                        1.12.01: 1266 sec (speed: 2.51)
                                                                        1.12.02: 750 sec (speed: 4.24)
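The speed ratios quoted above can be re-derived from the run times. The following is only a small sketch; the variable names are made up for illustration, and the timings are the ones reported in this post for a single work unit on one machine.

```python
# Run times in seconds for the same work unit, as quoted in the post.
timings = {"original": 3181, "1.12.01": 1266, "1.12.02": 750}

baseline_seconds = timings["original"]

# Speed relative to the original application (higher is faster).
speedups = {
    name: round(baseline_seconds / seconds, 2)
    for name, seconds in timings.items()
}

# Gain of 1.12.02 over 1.12.01 alone, as a percentage.
step_gain_percent = round((timings["1.12.01"] / timings["1.12.02"] - 1) * 100)

for name, ratio in speedups.items():
    print(f"{name}: {timings[name]} sec (speed: {ratio:.2f})")
print(f"1.12.02 vs 1.12.01: about {step_gain_percent}% faster")
```

The step from 1.12.01 to 1.12.02 works out to roughly 69%, which matches the "about 70%" mentioned above.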

                                                                        edit: This code also runs on all 386-compatible Windows machines.

                                                                        edit2: This code works only with the current 12th dimension WUs, but I think that isn't a problem for you.

                                                                        Profile Rebirther
                                                                        Avatar
                                                                        Send message
                                                                        Joined: Jul 12 05
                                                                        Posts: 81
                                                                        Credit: 15,472
                                                                        RAC: 0
                                                                        Message 2857 - Posted 1 Jun 2006 17:09:00 UTC

                                                                          @akosf: Also keep an eye on malaria/WCG/LHC, we respect all your hard work ;)

                                                                          Profile Rebirther
                                                                          Avatar
                                                                          Send message
                                                                          Joined: Jul 12 05
                                                                          Posts: 81
                                                                          Credit: 15,472
                                                                          RAC: 0
                                                                          Message 2858 - Posted 1 Jun 2006 19:01:20 UTC

                                                                            Last modified: 1 Jun 2006 19:10:58 UTC

                                                                            First results: 3,2Ghz P4 Northwood HT on
                                                                            9min/20min
                                                                            compared with an AMD X2 4400/my P4: 1998,83sec/757,91sec (3x faster with sdg11202)

                                                                            Nice work again!

                                                                            Rakarin
                                                                            Avatar
                                                                            Send message
                                                                            Joined: Feb 4 06
                                                                            Posts: 17
                                                                            Credit: 46,513
                                                                            RAC: 0
                                                                            Message 2861 - Posted 2 Jun 2006 4:18:06 UTC - in response to Message 2857.

                                                                              @akosf: Also keep an eye on malaria/WCG/LHC, we respect all your hard work ;)


                                                                              I'm sorry, but I would recommend not touching WCG's Fight AIDS @Home or LHC. FA@H studies chemical bonding to virus coat proteins by known anti-viral agents. LHC is creating data to build a particle collider. Each of those would have *no* tolerance for errors. FA@H involves direct medical care, and LHC errors can cause problems in equipment that costs more than the national income of some countries. Because of this, they could be unfriendly to the idea of someone outside their own developers working on their program or potentially altering results. You would also not want any problems to point to your work, or possibly even to you.

                                                                              If you speak with them and show them your results *before* you release anything, you may be able to work with it.

                                                                              This is just friendly advice.
                                                                              ____________

                                                                              Honza
                                                                              Send message
                                                                              Joined: Aug 14 05
                                                                              Posts: 26
                                                                              Credit: 13,511
                                                                              RAC: 0
                                                                              Message 2862 - Posted 2 Jun 2006 7:34:11 UTC

                                                                                I have to disagree with the post below in some respects.
                                                                                First, the part where I agree. I also think that the validity of results comes first. Running CPDN *classic* before the public beta was launched 3+ years ago, we were always concerned with validity of results, overclocking etc.
                                                                                As CPDN is the most complex application under BOINC, with many iterations/calculations, stressing CPU, RAM and HD (the whole system), a small deviation may lead to highly biased results (among other things). We have run many tests and compared results. One of them was AMD vs Intel results. Guess what? An AMD CPU gives slightly different numbers than an Intel one (no overclocking, same app, same OS, stable machine etc.).
                                                                                We have not been lucky with optimization on CPDN so far... neither with SSE, nor with the first 64-bit test 2 years ago. I'm not saying it is not possible, but it needs proper compilers/libs for very complex code developed over decades, long-term testing and, for sure, time, knowledge and money.

                                                                                But projects with simple or simpler code were lucky. They were lucky to make it run 1x, 2.5x or even 5x faster, which usually reveals poor coding and compilation (you can read about wasted CPU cycles on many forums from many users).
                                                                                And some projects got even more lucky with optimizations: they revealed weak spots that cripple not only efficiency, but also precision. To put it simply, optimized code can be even more accurate than the standard application!

                                                                                The benefits of optimization can be manifold: see SETI Enhanced (where optimized code was implemented in the official application, making it more efficient yet more precise) and Einstein (I believe with the same benefit).

                                                                                WCG with its outdated Rosetta application is not much different from Predictor with their outdated apps (and dying forum... is the whole project going down?), SIMAP, the always-developing and leading Rosetta, or other life-science projects. They are all simulations with probabilistic/statistical results. It's not 0 or 1. It's not that 1 result among a billion is correct and the others are useless, like distributed.net.
                                                                                LHC is also "only" a simulation with probabilistic results.

                                                                                How do they deal with errors in CPUs?
                                                                                How do they deal with hard overclocking that leads to errors in calculation?
                                                                                How do they deal with bad DDR modules, overheating, bad power supplies, poor voltage regulators etc. that lead to a few bit errors here or there?
                                                                                They would have to accept results only from properly tested machines with ECC memory, running on UPS etc.

                                                                                So, think it over. Machines with the standard/slow application can produce different numbers for the many reasons above. An optimized application doesn't automatically mean crippled results, as you may suggest. Optimization doesn't mean truncating numbers to make the calculation faster. Newer instruction sets may benefit from longer registers and hence better precision than outdated code, so in the end it may be both faster and more precise!
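One way the same calculation can give slightly different numbers on different hardware or builds is that floating-point addition is not associative, so a compiler or instruction set that reorders operations can change the last bits. The sketch below only illustrates that general effect; it is an assumption about one possible cause, not something reported by CPDN or any other project named here.

```python
# Same three numbers, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The two results differ in the last bit of the 64-bit float.
print(left_to_right == right_to_left)
print(left_to_right, right_to_left)
print(abs(left_to_right - right_to_left))
```

In a long iterative simulation such last-bit differences can accumulate, which is why neither the standard nor an optimized build is automatically the "correct" one.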
                                                                                ____________

                                                                                akosf
                                                                                Avatar
                                                                                Send message
                                                                                Joined: Aug 30 05
                                                                                Posts: 62
                                                                                Credit: 510,419
                                                                                RAC: 0
                                                                                Message 2866 - Posted 2 Jun 2006 9:23:19 UTC

                                                                                  sdg11203

                                                                                  time: 499
                                                                                  ratio: 6.37

                                                                                  Profile Rebirther
                                                                                  Avatar
                                                                                  Send message
                                                                                  Joined: Jul 12 05
                                                                                  Posts: 81
                                                                                  Credit: 15,472
                                                                                  RAC: 0
                                                                                  Message 2872 - Posted 2 Jun 2006 10:38:47 UTC

                                                                                    sdg11203: 64% speedup vs standard app (P4 3,2Ghz Northwood HT on), only 1% faster compared with sdg11202

                                                                                    Profile Piotr Skrodzewicz
                                                                                    Send message
                                                                                    Joined: May 26 06
                                                                                    Posts: 10
                                                                                    Credit: 8,171
                                                                                    RAC: 0
                                                                                    Message 2874 - Posted 2 Jun 2006 11:38:51 UTC - in response to Message 2856.

                                                                                      Last modified: 2 Jun 2006 11:40:43 UTC

                                                                                      You are right that two heads are better than one, but don't forget two things.
                                                                                      1. The SZTAKI code doesn't need more heads because it is a very simple program.
                                                                                      2. The safety of the code and of the BOINC users!


                                                                                      1. If so, this one HEAD should focus on optimization IMHO.
                                                                                      2. I agree; for example, an official max-optimized version would be good: users would not download a 3rd-party version of SZTAKI (potential backdoor problem).


                                                                                      I can tell you that sometimes clever programming gives better results than differently compiled versions.

                                                                                      Only SSE2 or SSE3 compilation would be good for SZTAKI, and I think a simple native compilation would improve the speed by only 0-3%. Not more.
                                                                                      MMX and SSE aren't good for SZTAKI. But I made another very, very small modification (about another 10 minutes of work) and it improved the speed by about 70%.


                                                                                      If you say that MMX/SSE will not help... OK. Also, SSE2 gives better results than SSE3 (with SETI). So why not give SSE2 a try?

                                                                                      ____________

                                                                                      akosf
                                                                                      Avatar
                                                                                      Send message
                                                                                      Joined: Aug 30 05
                                                                                      Posts: 62
                                                                                      Credit: 510,419
                                                                                      RAC: 0
                                                                                      Message 2875 - Posted 2 Jun 2006 13:00:05 UTC - in response to Message 2874.

                                                                                        Last modified: 2 Jun 2006 13:11:33 UTC

                                                                                        1. If so, this one HEAD should focus on optimization IMHO.

                                                                                        You are right that optimization is very important when you want to build a time- and cost-efficient system, but optimization is always one of the last steps. The science and safety have to come before it.
                                                                                        Fast wrong results would not be too useful.

                                                                                        2. I agree; for example, an official max-optimized version would be good: users would not download a 3rd-party version of SZTAKI (potential backdoor problem).

                                                                                        What do you mean by max-optimized version?
                                                                                        You can always find a better solution.
                                                                                        It depends only on your knowledge.

                                                                                        If you say that MMX/SSE will not help... OK. Also, SSE2 gives better results than SSE3 (with SETI). So why not give SSE2 a try?

                                                                                        I think SSE2-compiled code would be possible too... but there are two interesting solutions.
                                                                                        1. Different versions in different executables
                                                                                        2. Different versions (fpu/mmx/3dnow/sse/sse2/sse3/...) in the same executable
                                                                                        The standard BOINC recommendation is the second solution, so it would mean a much bigger application file that would be a little bit faster on some computers.
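The second solution (several versions in one executable) amounts to picking a routine at run time based on what the CPU supports. Here is a rough sketch of that selection logic only. Everything in it is hypothetical: the kernel names, the feature strings, and the way the feature set is passed in are assumptions for illustration, and the Python functions are stand-ins that do not actually use any SIMD instructions.

```python
from typing import Callable, Iterable, List, Tuple

Kernel = Callable[[List[int]], int]


def kernel_fpu(values: List[int]) -> int:
    """Stand-in for the plain x87 FPU routine that every CPU can run."""
    return sum(values)


def kernel_sse2(values: List[int]) -> int:
    """Stand-in for an SSE2 routine; must return the same result as kernel_fpu."""
    return sum(values)


# Ordered from most to least specialised (hypothetical feature names).
KERNELS: List[Tuple[str, Kernel]] = [
    ("sse2", kernel_sse2),
    ("fpu", kernel_fpu),
]


def select_kernel(cpu_features: Iterable[str]) -> Kernel:
    """Return the first kernel whose required feature the CPU reports."""
    available = set(cpu_features)
    for feature, kernel in KERNELS:
        if feature in available:
            return kernel
    return kernel_fpu  # safe fallback


# Feature sets below are typed in by hand, not detected from real hardware.
athlon64 = select_kernel({"fpu", "mmx", "sse", "sse2"})
barton = select_kernel({"fpu", "mmx", "3dnow", "sse"})
print(athlon64.__name__, barton.__name__)
```

The cost mentioned above is visible even in the sketch: every extra variant has to be shipped in the same file and must give the same results as the fallback.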

                                                                                        SSE2 is a part of SSE3, and SSE2 can't beat itself.
                                                                                        The SSE3-optimized SETI is slower for some other reason.

                                                                                        Honza
                                                                                        Send message
                                                                                        Joined: Aug 14 05
                                                                                        Posts: 26
                                                                                        Credit: 13,511
                                                                                        RAC: 0
                                                                                        Message 2877 - Posted 2 Jun 2006 15:11:57 UTC

                                                                                          Have to wait for slow machines and those with a large WU cache... so far, a couple of WUs with success. But this one is strange - http://szdg.lpds.sztaki.hu/szdg/workunit.php?wuid=819587
                                                                                          All success, no errors, but mine got 0 credit. Not that I care much about it, but I wonder what the validator thinks about this particular WU.
                                                                                          ____________

                                                                                          Profile Rebirther
                                                                                          Avatar
                                                                                          Send message
                                                                                          Joined: Jul 12 05
                                                                                          Posts: 81
                                                                                          Credit: 15,472
                                                                                          RAC: 0
                                                                                          Message 2878 - Posted 2 Jun 2006 15:22:48 UTC - in response to Message 2877.

                                                                                            Have to wait for slow machines and those with a large WU cache... so far, a couple of WUs with success. But this one is strange - http://szdg.lpds.sztaki.hu/szdg/workunit.php?wuid=819587
                                                                                            All success, no errors, but mine got 0 credit. Not that I care much about it, but I wonder what the validator thinks about this particular WU.


                                                                                            Look here
                                                                                            Don't know if the application has problems...

                                                                                            Profile Nightbird
                                                                                            Forum moderator
                                                                                            Avatar
                                                                                            Send message
                                                                                            Joined: Jul 12 05
                                                                                            Posts: 920
                                                                                            Credit: 114,924
                                                                                            RAC: 0
                                                                                            Message 2881 - Posted 2 Jun 2006 23:01:34 UTC

                                                                                              Last modified: 2 Jun 2006 23:32:29 UTC

                                                                                              If you say, that MMX/SSE will not help... OK.

                                                                                              Why ?
                                                                                              And you need a modern CPU to get SSE2 instructions.
                                                                                              Even my Barton 3200+, which is a very good performer, doesn't support them.

                                                                                              What about 3DNow(+) ?

                                                                                              ____________

                                                                                              Odysseus
                                                                                              Avatar
                                                                                              Send message
                                                                                              Joined: Feb 27 06
                                                                                              Posts: 212
                                                                                              Credit: 221,397
                                                                                              RAC: 0
                                                                                              Message 2882 - Posted 2 Jun 2006 23:25:45 UTC - in response to Message 2881.

                                                                                                Last modified: 2 Jun 2006 23:32:17 UTC

                                                                                                If you say, that MMX/SSE will not help... OK.

                                                                                                Why ?

                                                                                                AIUI the SZTAKI app is doing "discrete mathematics", where all the numbers concerned (coefficients of polynomials?) are integers. Now I'm far from knowledgeable about processor architecture, but I gather that the main purpose of CPU enhancements like SSE, MMX, 3DNow, and AltiVec is to increase the speed of floating-point calculations by shifting some of the work onto the quicker components of the processor that normally perform integer operations, in effect 'parallelizing' these functions by breaking them down into segments that can be performed simultaneously instead of consecutively. For applications that do a lot of trig, logs, and so on they can be very helpful, but they provide little, if any, benefit to integer computations. (No doubt this description is overly simplistic ...)
                                                                                                ____________

                                                                                                akosf
                                                                                                Avatar
                                                                                                Send message
                                                                                                Joined: Aug 30 05
                                                                                                Posts: 62
                                                                                                Credit: 510,419
                                                                                                RAC: 0
                                                                                                Message 2883 - Posted 3 Jun 2006 5:32:04 UTC - in response to Message 2881.

                                                                                                  Last modified: 3 Jun 2006 6:13:25 UTC

                                                                                                  If you say, that MMX/SSE will not help... OK.

                                                                                                  Why ?
                                                                                                  What about 3DNow(+) ?

                                                                                                  So, SZTAKI uses long numbers. They have many significant ("good") digits and go through long calculation sequences.

                                                                                                  Example number: 9*9*9*9*9*9*9*9*9*9*9*9*9*9 = 22876792454961

                                                                                                  FPU: 64-bit float: 15.3 good digits -> 22876792454961 (OK)
                                                                                                  FPU: 80-bit float: 19.2 good digits -> 22876792454961 (OK, but very slow)
                                                                                                  MMX: 32-bit integer: 9.5 good digits -> at most 4294967295 (cannot hold the value)
                                                                                                  3DNow!: 32-bit float: 6.3 good digits -> 22876793077760 (lower bits lost!)
                                                                                                  SSE: 32-bit float: 6.3 good digits -> 22876793077760 (lower bits lost!)
                                                                                                  SSE2: 64-bit float: 15.3 good digits -> 22876792454961 (OK)
                                                                                                  SSE3: no new number format (only additional instructions on top of SSE2)

                                                                                                  So only the FPU, SSE2 and 64-bit mode are suitable for SZTAKI, but 64-bit integer arithmetic is very, very slow, so forget that one quickly.
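
The comparison in the list above can be reproduced with a minimal Python sketch. It assumes that a round trip through `struct` is an acceptable stand-in for each hardware format: the exact value is converted once, rather than being built up by 14 multiplications in that format, so intermediate rounding is not modelled. The helper name `to_float32` is illustrative and does not come from the SZTAKI code.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


value = 9 ** 14  # exact, Python integers have arbitrary precision

# 64-bit float (FPU double, SSE2): 53-bit significand, the value survives.
as_float64 = int(float(value))

# 32-bit float (3DNow!, SSE): 24-bit significand, the low bits are rounded away.
as_float32 = int(to_float32(float(value)))

# 32-bit unsigned integer lane: the value simply does not fit.
fits_uint32 = value <= 2 ** 32 - 1

print(value)        # 22876792454961
print(as_float64)   # 22876792454961
print(as_float32)   # 22876793077760
print(fits_uint32)  # False
```

The 32-bit float result differs from the exact value by 622799, which is the "lower bits lost" case in the list.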

                                                                                                  Profile Piotr Skrodzewicz
                                                                                                  Send message
                                                                                                  Joined: May 26 06
                                                                                                  Posts: 10
                                                                                                  Credit: 8,171
                                                                                                  RAC: 0
                                                                                                  Message 2887 - Posted 3 Jun 2006 12:23:00 UTC

                                                                                                    Last modified: 3 Jun 2006 12:30:57 UTC

                                                                                                    You see ? SSE2 could help a bit :)

                                                                                                    More info about SSE2:
                                                                                                    http://en.wikipedia.org/wiki/SSE2

                                                                                                    BTW, can a P4 use the FPU and SSE2 at the same time, so that it is faster than using SSE2 only?
                                                                                                    ____________

                                                                                                    akosf
                                                                                                    Avatar
                                                                                                    Send message
                                                                                                    Joined: Aug 30 05
                                                                                                    Posts: 62
                                                                                                    Credit: 510,419
                                                                                                    RAC: 0
                                                                                                    Message 2888 - Posted 3 Jun 2006 13:47:21 UTC - in response to Message 2887.

                                                                                                      Last modified: 3 Jun 2006 13:48:04 UTC

                                                                                                      You see ? SSE2 could help a bit :)

                                                                                                      BTW, can a P4 use the FPU and SSE2 at the same time, so that it is faster than using SSE2 only?

                                                                                                      I think you don't know much about the floating-point engines of these processors.

                                                                                                      Optimized SSE2 code isn't faster than optimized FPU code on the current processors (Athlon64, Pentium 4 and Pentium M clones). Only the new Core has a real 128-bit wide SSE engine, so it will be about 20% faster with SSE2. But that gain will appear only in parallel-coded applications.

                                                                                                      And your P4 can execute combined FPU/SSE2 code, but you would not be happy when you saw the big performance loss.

                                                                                                      Tom
                                                                                                      Send message
                                                                                                      Joined: May 26 06
                                                                                                      Posts: 5
                                                                                                      Credit: 44,277
                                                                                                      RAC: 0
                                                                                                      Message 2889 - Posted 3 Jun 2006 16:07:17 UTC - in response to Message 2888.

                                                                                                        Last modified: 3 Jun 2006 16:07:52 UTC

                                                                                                        ...Only the new Core has a real 128-bit wide SSE engine, so it will be about 20% faster with SSE2. But that gain will appear only in parallel-coded applications...

                                                                                                        So SZTAKI won't be able to get that extra 20% even on those dual-core Athlon64 processors?

                                                                                                        ____________

                                                                                                        akosf
                                                                                                        Avatar
                                                                                                        Send message
                                                                                                        Joined: Aug 30 05
                                                                                                        Posts: 62
                                                                                                        Credit: 510,419
                                                                                                        RAC: 0
                                                                                                        Message 2890 - Posted 3 Jun 2006 16:47:09 UTC - in response to Message 2889.

                                                                                                          Last modified: 3 Jun 2006 16:50:08 UTC

                                                                                                          ...Only the new Core has a real 128-bit wide SSE engine, so it will be about 20% faster with SSE2. But that gain will appear only in parallel-coded applications...

                                                                                                          So SZTAKI won't be able to get that extra 20% even on those dual-core Athlon64 processors?

                                                                                                          So, Athlon64.

                                                                                                          It has a 64-bit wide SSE engine (scheduler). That means the CPU decodes a packed double-precision SSE2 instruction as two macro-ops, and the engine schedules them one clock cycle apart. You get the same timing if you use two FPU instructions instead of one SSE2 instruction. I measured less than 2% speed improvement with SSE2 code. I think it comes from the decoding methods (FPU in memory, SSE2 in CPU).

                                                                                                          The answer.
                                                                                                          Clever programming could increase the difference between FPU and SSE2 in SZTAKI, probably by about 20%. So it is possible, but I cannot modify the current binary because it would take a long time.

                                                                                                          Another question.
                                                                                                          Why do you want a 20% (?) faster SSE2 compilation instead of more cleverly programmed code?
                                                                                                          My current code is about 814% faster and runs on my old computer (Duron) too.

                                                                                                          Profile Narwhal
                                                                                                          Avatar
                                                                                                          Send message
                                                                                                          Joined: Aug 25 05
                                                                                                          Posts: 32
                                                                                                          Credit: 688,804
                                                                                                          RAC: 0
                                                                                                          Message 2891 - Posted 3 Jun 2006 17:55:05 UTC

                                                                                                            http://eclient.tvn.hu/sdg11203.zip now has a problem with link, something wrong or just me?


                                                                                                            ____________

                                                                                                            hacki
                                                                                                            Send message
                                                                                                            Joined: May 31 06
                                                                                                            Posts: 1
                                                                                                            Credit: 639
                                                                                                            RAC: 0
                                                                                                            Message 2892 - Posted 3 Jun 2006 17:56:08 UTC

                                                                                                              I couldn't download the new optimized version from here: http://eclient.tvn.hu/sdg11203.zip

                                                                                                              Could you fix that?
                                                                                                              Thx


                                                                                                              ____________

                                                                                                              Tom
                                                                                                              Send message
                                                                                                              Joined: May 26 06
                                                                                                              Posts: 5
                                                                                                              Credit: 44,277
                                                                                                              RAC: 0
                                                                                                              Message 2893 - Posted 3 Jun 2006 17:57:08 UTC - in response to Message 2890.

                                                                                                                Last modified: 3 Jun 2006 17:58:09 UTC

                                                                                                                ...Another question.
                                                                                                                Why do you want a 20% (?) faster SSE2 compilation instead of more cleverly programmed code?
                                                                                                                My current code is about 814% faster and runs on my old computer (Duron) too.

                                                                                                                No, I just misunderstood your statement as 'it would not work' versus 'it's a minor performance enhancement compared to improving the algorithm'. Thank you for the clarification.

                                                                                                                So, who do we ask to get this latest improvement running under Linux?

                                                                                                                ____________

                                                                                                                akosf
                                                                                                                Avatar
                                                                                                                Send message
                                                                                                                Joined: Aug 30 05
                                                                                                                Posts: 62
                                                                                                                Credit: 510,419
                                                                                                                RAC: 0
                                                                                                                Message 2894 - Posted 3 Jun 2006 19:52:42 UTC - in response to Message 2891.

                                                                                                                  http://eclient.tvn.hu/sdg11203.zip now has a problem with link, something wrong or just me?

                                                                                                                  The optimized files have been removed because I got some "zero credit" results with them; perhaps they have an error. A version that will capture these WUs is running now. I will be able to test the optimized application after that, and I will put the files back if they turn out to be correct.

                                                                                                                  Profile Rebirther
                                                                                                                  Avatar
                                                                                                                  Send message
                                                                                                                  Joined: Jul 12 05
                                                                                                                  Posts: 81
                                                                                                                  Credit: 15,472
                                                                                                                  RAC: 0
                                                                                                                  Message 2895 - Posted 3 Jun 2006 20:06:50 UTC - in response to Message 2894.

                                                                                                                    http://eclient.tvn.hu/sdg11203.zip now has a problem with link, something wrong or just me?

                                                                                                                    The optimized files have been removed because I got some "zero credit" results with them; perhaps they have an error. A version that will capture these WUs is running now. I will be able to test the optimized application after that, and I will put the files back if they turn out to be correct.


                                                                                                                    But why did Macs and other computers also get zero credit with the standard app?

                                                                                                                    Profile UBT - Halifax--lad
                                                                                                                    Avatar
                                                                                                                    Send message
                                                                                                                    Joined: Sep 10 05
                                                                                                                    Posts: 126
                                                                                                                    Credit: 3,147
                                                                                                                    RAC: 0
                                                                                                                    Message 2896 - Posted 3 Jun 2006 20:20:11 UTC

                                                                                                                      I too think it will be a SZTAKI problem and not your app, as they already have 0-credit problems.
                                                                                                                      ____________
                                                                                                                      Join us in Chat (see the forum) Click the Sig


                                                                                                                      Join UBT

                                                                                                                      akosf
                                                                                                                      Avatar
                                                                                                                      Send message
                                                                                                                      Joined: Aug 30 05
                                                                                                                      Posts: 62
                                                                                                                      Credit: 510,419
                                                                                                                      RAC: 0
                                                                                                                      Message 2897 - Posted 3 Jun 2006 21:07:46 UTC - in response to Message 2896.

                                                                                                                        I too think it will be a SZTAKI problem and not your app, as they already have 0-credit problems.

                                                                                                                        How can you rule out the possibility of a badly optimized application?

                                                                                                                        But why did Macs and other computers also get zero credit with the standard app?

                                                                                                                        Sorry, I haven't examined the Mac programs. I don't know the reason for their problem.

                                                                                                                        Profile UBT - Halifax--lad
                                                                                                                        Avatar
                                                                                                                        Send message
                                                                                                                        Joined: Sep 10 05
                                                                                                                        Posts: 126
                                                                                                                        Credit: 3,147
                                                                                                                        RAC: 0
                                                                                                                        Message 2898 - Posted 3 Jun 2006 21:12:13 UTC

                                                                                                                          Because 0-credit WUs are already a known problem here. It doesn't matter for me though; I got the optimised app before it was taken offline.
                                                                                                                          ____________

                                                                                                                          akosf
                                                                                                                          Avatar
                                                                                                                          Send message
                                                                                                                          Joined: Aug 30 05
                                                                                                                          Posts: 62
                                                                                                                          Credit: 510,419
                                                                                                                          RAC: 0
                                                                                                                          Message 2899 - Posted 3 Jun 2006 21:24:28 UTC - in response to Message 2898.

                                                                                                                            Last modified: 3 Jun 2006 21:45:05 UTC

                                                                                                                            Because 0-credit WUs are already a known problem here. It doesn't matter for me though; I got the optimised app before it was taken offline.

                                                                                                                            That is only idle chatter, I hope you know.

                                                                                                                            ckohler3
                                                                                                                            Send message
                                                                                                                            Joined: Aug 26 05
                                                                                                                            Posts: 14
                                                                                                                            Credit: 287,560
                                                                                                                            RAC: 0
                                                                                                                            Message 2900 - Posted 3 Jun 2006 23:13:03 UTC

                                                                                                                              Is sdg11202 also having problems with 0 credit granted? I would try it for you if you like.
                                                                                                                              ____________

                                                                                                                              akosf
                                                                                                                              Avatar
                                                                                                                              Send message
                                                                                                                              Joined: Aug 30 05
                                                                                                                              Posts: 62
                                                                                                                              Credit: 510,419
                                                                                                                              RAC: 0
                                                                                                                              Message 2901 - Posted 3 Jun 2006 23:50:39 UTC - in response to Message 2900.

                                                                                                                                Last modified: 3 Jun 2006 23:51:46 UTC

                                                                                                                                Is sdg11202 also having problems with 0 credit granted? I would try it for you if you like.

                                                                                                                                I found a fault in sdg11202 that made it produce wrong results (the validator usually accepted these too, but they had no scientific value).

                                                                                                                                So far I have found only a laborious test method:

                                                                                                                                - download some WUs
                                                                                                                                - disable the network
                                                                                                                                - process the WUs
                                                                                                                                - analyze the result files (simple text file, simple format)
                                                                                                                                - make a note of the names of interesting results
                                                                                                                                - enable the network (send back the results)
                                                                                                                                - wait for validation

                                                                                                                                The app is probably good if the interesting results get credit.
                                                                                                                                (More good results -> higher confidence in the reliability of the app.)
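
One way to shorten the "analyze result files" step, sketched here under assumptions the thread does not state: the same work units are processed offline by both the stock app and the optimized app, and each set of result files is saved in its own directory. This is a swapped-in technique (a local comparison against the stock app), not the validation-based check described in the post. The directory layout, the helper names `normalized_lines` and `differing_results`, and the whitespace-insensitive comparison are all hypothetical; the real result format is only described as a simple text file.

```python
from pathlib import Path


def normalized_lines(path: Path) -> list[str]:
    """Read a text result file, ignoring trailing whitespace and blank lines."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return [line.rstrip() for line in text.splitlines() if line.strip()]


def differing_results(stock_dir: Path, optimized_dir: Path) -> list[str]:
    """Return names of stock result files that are missing or differ in optimized_dir."""
    suspects = []
    for stock_file in sorted(stock_dir.iterdir()):
        if not stock_file.is_file():
            continue
        candidate = optimized_dir / stock_file.name
        if not candidate.is_file():
            suspects.append(stock_file.name)  # optimized app produced no result
        elif normalized_lines(candidate) != normalized_lines(stock_file):
            suspects.append(stock_file.name)  # results disagree
    return suspects
```

An empty list from `differing_results(Path("stock"), Path("optimized"))` would only show that the two builds agree on those work units; it says nothing about whether the stock results themselves are valid.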

                                                                                                                                Profile Nightbird
                                                                                                                                Forum moderator
                                                                                                                                Avatar
                                                                                                                                Send message
                                                                                                                                Joined: Jul 12 05
                                                                                                                                Posts: 920
                                                                                                                                Credit: 114,924
                                                                                                                                RAC: 0
                                                                                                                                Message 2902 - Posted 4 Jun 2006 0:46:32 UTC - in response to Message 2898.

                                                                                                                                  Last modified: 4 Jun 2006 1:34:23 UTC

                                                                                                                                  Because 0-credit WUs are already a known problem here; it doesn't matter for me though, I got the optimised app in before it was taken offline

                                                                                                                                  Ehm, we have had problems with 0-credit WUs because the validator saw them as invalid (and it was right). But credits were always granted.
                                                                                                                                  Since 1.12, things are really better.
                                                                                                                                  But although most WUs are valid, I did indeed see invalid WUs with the standard application.
                                                                                                                                  If you have doubts about (invalid) WUs, feel free to open a new thread or to use this one: feedback 1.12
                                                                                                                                  Thanks.

                                                                                                                                  ____________

                                                                                                                                  Honza
                                                                                                                                  Send message
                                                                                                                                  Joined: Aug 14 05
                                                                                                                                  Posts: 26
                                                                                                                                  Credit: 13,511
                                                                                                                                  RAC: 0
                                                                                                                                  Message 2905 - Posted 4 Jun 2006 8:36:47 UTC - in response to Message 2901.

                                                                                                                                    - disable network
                                                                                                                                    I would like to follow this scenario.
                                                                                                                                    But due to a bug in 5.4.9, suspending the network doesn't work, and disabling the OS network would be the last command I would be able to send over VNC remote desktop (:-

                                                                                                                                    This scenario would also enable us to really measure the speedup of the optimized apps over the official ones on a re-run (and to compare results).
                                                                                                                                    Just run with the opt. apps in the first round, revert to the backup and run with the official apps, compare results (note the speedup) and return the results of the official app.
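
The re-run comparison described above could be scripted roughly as follows. This is only a sketch under assumptions: the function names, the idea that a result can be read as a list of numbers, the tolerance and the timings are all hypothetical and are not taken from the SZTAKI application.

```python
import math

def speedup(official_seconds, optimized_seconds):
    """How many times faster the optimized run was than the official one."""
    return official_seconds / optimized_seconds

def results_match(official, optimized, rel_tol=1e-9):
    """True if both runs produced the same number of values and
    every pair of values agrees within the relative tolerance."""
    if len(official) != len(optimized):
        return False
    return all(math.isclose(a, b, rel_tol=rel_tol)
               for a, b in zip(official, optimized))

if __name__ == "__main__":
    # Hypothetical timings (in seconds) of the same work unit run twice.
    print(speedup(3600.0, 2400.0))  # 1.5
    # Hypothetical outputs of the two runs; a NaN never matches anything.
    print(results_match([1.0, 2.5], [1.0, 2.5]))
```

Only the results of the official app would be returned to the project; the comparison itself stays local.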

                                                                                                                                    ____________

                                                                                                                                    Profile Rebirther
                                                                                                                                    Avatar
                                                                                                                                    Send message
                                                                                                                                    Joined: Jul 12 05
                                                                                                                                    Posts: 81
                                                                                                                                    Credit: 15,472
                                                                                                                                    RAC: 0
                                                                                                                                    Message 2906 - Posted 4 Jun 2006 9:23:07 UTC - in response to Message 2902.


                                                                                                                                      Ehm, we have had problems with 0-credit WUs because the validator saw them as invalid (and it was right). But credits were always granted.
                                                                                                                                      Since 1.12, things are really better.
                                                                                                                                      But although most WUs are valid, I did indeed see invalid WUs with the standard application.


                                                                                                                                      Do you mean a script is running to validate these zero credits?

                                                                                                                                      Profile UBT - Halifax--lad
                                                                                                                                      Avatar
                                                                                                                                      Send message
                                                                                                                                      Joined: Sep 10 05
                                                                                                                                      Posts: 126
                                                                                                                                      Credit: 3,147
                                                                                                                                      RAC: 0
                                                                                                                                      Message 2907 - Posted 4 Jun 2006 10:13:43 UTC - in response to Message 2902.

                                                                                                                                        Because 0-credit WUs are already a known problem here; it doesn't matter for me though, I got the optimised app in before it was taken offline

                                                                                                                                        Ehm, we have had problems with 0-credit WUs because the validator saw them as invalid (and it was right). But credits were always granted.
                                                                                                                                        Since 1.12, things are really better.
                                                                                                                                        But although most WUs are valid, I did indeed see invalid WUs with the standard application.
                                                                                                                                        If you have doubts about (invalid) WUs, feel free to open a new thread or to use this one: feedback 1.12
                                                                                                                                        Thanks.


                                                                                                                                        Still waiting for my computer to come out of EDF so I can try the optimised app thing; those CPDN WUs are playing havoc lately with my EDF! :D

                                                                                                                                        ____________

                                                                                                                                        akosf
                                                                                                                                        Avatar
                                                                                                                                        Send message
                                                                                                                                        Joined: Aug 30 05
                                                                                                                                        Posts: 62
                                                                                                                                        Credit: 510,419
                                                                                                                                        RAC: 0
                                                                                                                                        Message 2908 - Posted 4 Jun 2006 11:12:33 UTC

                                                                                                                                          So, I found that the official application runs out of the number range of the FPU. Probably the 64-bit wide floating-point format (mantissa: 52 bits + sign) isn't enough for the application. My latest code uses the 80-bit wide format (mantissa: 64 bits + sign) for the biggest numbers, which is much better, so zero credit appears when the official app generates invalid numbers in the FPU.

                                                                                                                                          I will try to justify this assumption.
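
To put numbers on the two formats mentioned above, here is a small sketch that computes the largest finite value of each from its bit layout. The helper name and the sample value `10**400` are made up for illustration; the post does not say which magnitudes the application actually reaches.

```python
import sys

def largest_finite(exponent_bits, precision_bits):
    """Largest finite value of an IEEE-style binary format, as an exact integer.

    precision_bits counts the significand including the leading 1 bit:
    53 for the 64-bit double (52 stored bits plus an implicit bit),
    64 for the x87 80-bit extended format (all 64 bits stored).
    """
    max_exp = 2 ** (exponent_bits - 1)  # 1024 for 11 bits, 16384 for 15 bits
    return (2 ** precision_bits - 1) * 2 ** (max_exp - precision_bits)

DOUBLE_MAX = largest_finite(11, 53)    # about 1.8e308
EXTENDED_MAX = largest_finite(15, 64)  # about 1.2e4932

if __name__ == "__main__":
    # Sanity check against the double format Python itself uses.
    print(DOUBLE_MAX == int(sys.float_info.max))
    # A hypothetical intermediate value: too big for a double,
    # but well inside the 80-bit range.
    sample = 10 ** 400
    print(sample > DOUBLE_MAX, sample <= EXTENDED_MAX)
```

The 80-bit format has both a longer significand and four more exponent bits, so it gains precision as well as range.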

                                                                                                                                          akosf
                                                                                                                                          Avatar
                                                                                                                                          Send message
                                                                                                                                          Joined: Aug 30 05
                                                                                                                                          Posts: 62
                                                                                                                                          Credit: 510,419
                                                                                                                                          RAC: 0
                                                                                                                                          Message 2909 - Posted 4 Jun 2006 11:46:04 UTC - in response to Message 2908.

                                                                                                                                            So, I found that the official application runs out of the number range of the FPU. Probably the 64-bit wide floating-point format (mantissa: 52 bits + sign) isn't enough for the application. My latest code uses the 80-bit wide format (mantissa: 64 bits + sign) for the biggest numbers, which is much better, so zero credit appears when the official app generates invalid numbers in the FPU.

                                                                                                                                            I will try to justify this assumption.

                                                                                                                                            Yep, I'm right.
                                                                                                                                            7 minutes was enough to catch an invalid number (NaN) on the FPU stack.
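
As an illustration of how a NaN can appear once a value leaves the double range, the sketch below overflows a 64-bit float and then scans a list of values for the first invalid one. It is not how the NaN above was actually caught (that was observed on the x87 FPU stack); the function name and the sample values are assumptions.

```python
import math

def first_invalid(values):
    """Index of the first NaN or infinity in values, or None if all are finite."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            return i
    return None

big = 1e200
overflowed = big * big                     # exceeds the double range, becomes inf
not_a_number = overflowed - overflowed     # inf - inf is undefined, becomes NaN

if __name__ == "__main__":
    print(overflowed, not_a_number)
    print(first_invalid([1.0, 2.0, not_a_number, 3.0]))  # 2
```

Once a NaN is in a calculation it propagates through every later arithmetic operation, which is why a single overflow can invalidate a whole result.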

                                                                                                                                            Profile Nightbird
                                                                                                                                            Forum moderator
                                                                                                                                            Avatar
                                                                                                                                            Send message
                                                                                                                                            Joined: Jul 12 05
                                                                                                                                            Posts: 920
                                                                                                                                            Credit: 114,924
                                                                                                                                            RAC: 0
                                                                                                                                            Message 2910 - Posted 4 Jun 2006 12:12:22 UTC - in response to Message 2909.

                                                                                                                                              Last modified: 4 Jun 2006 12:36:03 UTC

                                                                                                                                              So, I found that the official application runs out of the number range of the FPU. Probably the 64-bit wide floating-point format (mantissa: 52 bits + sign) isn't enough for the application. My latest code uses the 80-bit wide format (mantissa: 64 bits + sign) for the biggest numbers, which is much better, so zero credit appears when the official app generates invalid numbers in the FPU.

                                                                                                                                              I will try to justify this assumption.

                                                                                                                                              Yep, I'm right.
                                                                                                                                              7 minutes was enough to catch an invalid number (NaN) on the FPU stack.

                                                                                                                                              Incredible
                                                                                                                                              Ehm, what's the next step or surprise? :) ;)

                                                                                                                                              ____________

                                                                                                                                              Profile Nightbird
                                                                                                                                              Forum moderator
                                                                                                                                              Avatar
                                                                                                                                              Send message
                                                                                                                                              Joined: Jul 12 05
                                                                                                                                              Posts: 920
                                                                                                                                              Credit: 114,924
                                                                                                                                              RAC: 0
                                                                                                                                              Message 2911 - Posted 4 Jun 2006 14:07:35 UTC - in response to Message 2907.


                                                                                                                                                Still waiting for my computer to come out of EDF so I can try the optimised app thing; those CPDN WUs are playing havoc lately with my EDF! :D

                                                                                                                                                And when do you think your computer will come out of EDF?

                                                                                                                                                ____________

                                                                                                                                                baracutio
                                                                                                                                                Send message
                                                                                                                                                Joined: Sep 4 05
                                                                                                                                                Posts: 5
                                                                                                                                                Credit: 3,103
                                                                                                                                                RAC: 0
                                                                                                                                                Message 2915 - Posted 5 Jun 2006 0:51:29 UTC

                                                                                                                                                  Hi akos... when could you put back your optimized app? I would like to download it :)

                                                                                                                                                  and which projects are you going to optimize next?



                                                                                                                                                  Kind regards, bara
                                                                                                                                                  ____________

                                                                                                                                                  akosf
                                                                                                                                                  Avatar
                                                                                                                                                  Send message
                                                                                                                                                  Joined: Aug 30 05
                                                                                                                                                  Posts: 62
                                                                                                                                                  Credit: 510,419
                                                                                                                                                  RAC: 0
                                                                                                                                                  Message 2916 - Posted 5 Jun 2006 5:56:46 UTC - in response to Message 2915.

                                                                                                                                                    Last modified: 5 Jun 2006 5:57:24 UTC

                                                                                                                                                    Hi akos... when could you put back your optimized app? I would like to download it :)

                                                                                                                                                    You can download sdg11201.zip, because its results aren't different from the official app's.

                                                                                                                                                    I'm working on an SSE2 version; I hope it will produce similar results too.

                                                                                                                                                    and which projects are you going to optimize next?

                                                                                                                                                    I have lots of other plans, it is their turn.

                                                                                                                                                    Perhaps I will optimize SETI once its version stops changing.

                                                                                                                                                    Profile Nightbird
                                                                                                                                                    Forum moderator
                                                                                                                                                    Avatar
                                                                                                                                                    Send message
                                                                                                                                                    Joined: Jul 12 05
                                                                                                                                                    Posts: 920
                                                                                                                                                    Credit: 114,924
                                                                                                                                                    RAC: 0
                                                                                                                                                    Message 2918 - Posted 5 Jun 2006 7:41:16 UTC - in response to Message 2916.

                                                                                                                                                      Last modified: 5 Jun 2006 7:52:41 UTC

                                                                                                                                                      Hi akos... when could you put back your optimized app? I would like to download it :)

                                                                                                                                                      You can download sdg11201.zip, because its results aren't different from the official app's.

                                                                                                                                                      I'm working on an SSE2 version; I hope it will produce similar results too.


                                                                                                                                                      And what will happen if I install an SSE2 version on an SSE-only CPU?
                                                                                                                                                      Is there another solution if you don't have an SSE2 CPU?

                                                                                                                                                      and which projects are you going to optimize next?
                                                                                                                                                      I have lots of other plans, it is their turn.


                                                                                                                                                      Perhaps I will optimize SETI once its version stops changing.


                                                                                                                                                      If you optimize SETI, then the users will be very happy, I guess... :)
                                                                                                                                                      The "problem" is that the versions have changed often (3 in 1 month) because of Enhanced.

                                                                                                                                                      Thanks for what you are doing and will do for the Boinc community. :)
                                                                                                                                                      ____________

                                                                                                                                                      akosf
                                                                                                                                                      Avatar
                                                                                                                                                      Send message
                                                                                                                                                      Joined: Aug 30 05
                                                                                                                                                      Posts: 62
                                                                                                                                                      Credit: 510,419
                                                                                                                                                      RAC: 0
                                                                                                                                                      Message 2919 - Posted 5 Jun 2006 8:32:39 UTC - in response to Message 2918.

                                                                                                                                                        Last modified: 5 Jun 2006 8:36:47 UTC

                                                                                                                                                        And what will happen if I install an SSE2 version on an SSE-only CPU?

                                                                                                                                                        In the normal case:
                                                                                                                                                        - the processor will generate an exception
                                                                                                                                                        - it will get "unhandled" status because SZTAKI doesn't handle it
                                                                                                                                                        - the operating system gives you a message and closes the application
                                                                                                                                                        - you get an invalid WU with zero credit

                                                                                                                                                        In the abnormal case (unstable operating system):
                                                                                                                                                        - many funny things (system restart or hang)

                                                                                                                                                        Is there another solution if you don't have an SSE2 CPU?

                                                                                                                                                        As far as I know, only the Transmeta processors and some FPGA-based CPUs support "instruction set" upgrades.
                                                                                                                                                        You cannot teach the others SSE2.
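
One way to avoid the crash described above is to check the CPU's feature flags before choosing which build to install. The sketch below parses text in the layout of Linux's `/proc/cpuinfo`; the function names, the build names "sse2" and "generic", and the sample text are assumptions for illustration, and nothing here is part of the SZTAKI client.

```python
def cpu_flags(cpuinfo_text):
    """Collect the feature flags from text in the Linux /proc/cpuinfo layout."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags

def pick_build(flags):
    """Name of the build to run: the SSE2 one only if the CPU reports sse2."""
    return "sse2" if "sse2" in flags else "generic"

# Hypothetical cpuinfo excerpt for an SSE-only processor.
SAMPLE = "model name : Example CPU\nflags : fpu mmx sse\n"

if __name__ == "__main__":
    print(pick_build(cpu_flags(SAMPLE)))  # generic
```

On a real Linux machine the text would come from reading `/proc/cpuinfo`; other operating systems expose the same information through different interfaces.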

                                                                                                                                                        Marky-UK
                                                                                                                                                        Send message
                                                                                                                                                        Joined: Dec 10 05
                                                                                                                                                        Posts: 6
                                                                                                                                                        Credit: 1,701
                                                                                                                                                        RAC: 0
                                                                                                                                                        Message 2920 - Posted 5 Jun 2006 8:40:50 UTC

                                                                                                                                                          What was 'wrong' with the sdg11202 and sdg11203 versions in the end?

                                                                                                                                                          Were they occasionally producing invalid results? Or is it the standard application that does this and sdg11202/3 are producing good data which doesn't validate against the standard application's occasionally bad data?
                                                                                                                                                          ____________

                                                                                                                                                          akosf
                                                                                                                                                          Avatar
                                                                                                                                                          Send message
                                                                                                                                                          Joined: Aug 30 05
                                                                                                                                                          Posts: 62
                                                                                                                                                          Credit: 510,419
                                                                                                                                                          RAC: 0
                                                                                                                                                          Message 2921 - Posted 5 Jun 2006 9:38:11 UTC - in response to Message 2920.

                                                                                                                                                            Last modified: 5 Jun 2006 9:40:12 UTC

                                                                                                                                                            What was 'wrong' with the sdg11202 and sdg11203 versions in the end?

                                                                                                                                                            Their results were different from the official app's.

                                                                                                                                                            Were they occasionally producing invalid results? Or is it the standard application that does this and sdg11202/3 are producing good data which doesn't validate against the standard application's occasionally bad data?

                                                                                                                                                            I don't know.

                                                                                                                                                            What I do know:
                                                                                                                                                            - sdg11201 produces results identical to the official app's
                                                                                                                                                            - I got significantly more "zero credit" results with sdg11202 and sdg11203

                                                                                                                                                            Possible causes:
                                                                                                                                                            - a validator problem (but that doesn't explain the significant difference in "zero credits")
                                                                                                                                                            - the official app sometimes produces bad results (how can I check that?)
                                                                                                                                                            - sdg11202 and sdg11203 sometimes produce bad results

                                                                                                                                                            akosf
                                                                                                                                                            Avatar
                                                                                                                                                            Send message
                                                                                                                                                            Joined: Aug 30 05
                                                                                                                                                            Posts: 62
                                                                                                                                                            Credit: 510,419
                                                                                                                                                            RAC: 0
                                                                                                                                                            Message 2928 - Posted 5 Jun 2006 20:22:18 UTC

                                                                                                                                                              So, the SSE2 code is ready.

                                                                                                                                                              I launched this code on one computer to process about 250 WUs, and the original app on another computer with the same WUs.
                                                                                                                                                              I think it will be a very good test, because I will be able to compare the results without the validator, and I will also be able to send these WUs back to check the validator. :)

                                                                                                                                                              The first computer needs about 20 hours, the second one needs about... 10 days?!
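
                                                                                                                                                              The comparison described above could be sketched as follows. This assumes each app writes one result file per WU and that both machines use the same file names; the directory layout and the function name compare_results are hypothetical, and treating "same result" as byte-for-byte equality is also an assumption.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's contents, used as a cheap equality check."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def compare_results(official_dir: Path, optimized_dir: Path) -> dict:
    """Compare per-WU result files by name between two directories."""
    official = {p.name: p for p in official_dir.iterdir() if p.is_file()}
    optimized = {p.name: p for p in optimized_dir.iterdir() if p.is_file()}
    report = {"same": [], "different": [], "missing": []}
    for name in sorted(official):
        if name not in optimized:
            # The optimized run has no result for this WU yet.
            report["missing"].append(name)
        elif file_digest(official[name]) == file_digest(optimized[name]):
            report["same"].append(name)
        else:
            report["different"].append(name)
    return report
```

                                                                                                                                                              A WU listed under "different" only shows that the two apps disagree; it does not say which one is wrong.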

                                                                                                                                                              Profile UBT - Halifax--lad
                                                                                                                                                              Avatar
                                                                                                                                                              Send message
                                                                                                                                                              Joined: Sep 10 05
                                                                                                                                                              Posts: 126
                                                                                                                                                              Credit: 3,147
                                                                                                                                                              RAC: 0
                                                                                                                                                              Message 2929 - Posted 5 Jun 2006 20:25:15 UTC - in response to Message 2911.


                                                                                                                                                                Still waiting for my computer to come out of EDF so I can try the optimised app; those CPDN WUs have been playing havoc with my EDF lately! :D

                                                                                                                                                                And when do you think your computer will come out of EDF?


                                                                                                                                                                I really don't know on that one. I am beginning to think something is wrong; I am monitoring it at the moment, both on LTD & STD.
                                                                                                                                                                ____________

                                                                                                                                                                vonHalenbach
                                                                                                                                                                Send message
                                                                                                                                                                Joined: May 31 06
                                                                                                                                                                Posts: 25
                                                                                                                                                                Credit: 461
                                                                                                                                                                RAC: 0
                                                                                                                                                                Message 2931 - Posted 5 Jun 2006 20:50:01 UTC - in response to Message 2928.

                                                                                                                                                                  Last modified: 5 Jun 2006 20:55:46 UTC


                                                                                                                                                                  The first computer needs about 20 hours, the second one needs about... 10 days?!


                                                                                                                                                                  lol
                                                                                                                                                                  I suspected there would be a drawback. ;-)

                                                                                                                                                                  So I will wait and see if that helps to tackle the problem.
                                                                                                                                                                  Akos, do you think that Bernd Machenschalk would be willing to integrate your ideas into the sztaki-client? That would be great, because I only use Linux.
                                                                                                                                                                  Greetings
                                                                                                                                                                  vonHalenbach
                                                                                                                                                                  ____________

                                                                                                                                                                  ckohler3
                                                                                                                                                                  Send message
                                                                                                                                                                  Joined: Aug 26 05
                                                                                                                                                                  Posts: 14
                                                                                                                                                                  Credit: 287,560
                                                                                                                                                                  RAC: 0
                                                                                                                                                                  Message 2932 - Posted 5 Jun 2006 20:51:36 UTC

                                                                                                                                                                    sdg11201 is working great for now.
                                                                                                                                                                    thanks so much for your efforts akosf
                                                                                                                                                                    ____________

                                                                                                                                                                    Profile Ananas
                                                                                                                                                                    Send message
                                                                                                                                                                    Joined: Jul 12 05
                                                                                                                                                                    Posts: 222
                                                                                                                                                                    Credit: 665,833
                                                                                                                                                                    RAC: 0
                                                                                                                                                                    Message 2934 - Posted 5 Jun 2006 21:43:47 UTC

                                                                                                                                                                      sdg11202 seems to work well on P4 and PM Banias but produces invalid results now and then on Athlon XP.

                                                                                                                                                                      This seems a little strange to me as the basic instruction set without extensions should be identical.

                                                                                                                                                                      Profile Nightbird
                                                                                                                                                                      Forum moderator
                                                                                                                                                                      Avatar
                                                                                                                                                                      Send message
                                                                                                                                                                      Joined: Jul 12 05
                                                                                                                                                                      Posts: 920
                                                                                                                                                                      Credit: 114,924
                                                                                                                                                                      RAC: 0
                                                                                                                                                                      Message 2935 - Posted 5 Jun 2006 21:55:12 UTC - in response to Message 2931.

                                                                                                                                                                        Akos, do you think that Bernd Machenschalk would be willing to integrate your ideas into the sztaki-client? That would be great, because I only use Linux.
                                                                                                                                                                        Greetings
                                                                                                                                                                        vonHalenbach

                                                                                                                                                                        Frankly, I think that's a job for Adam.


                                                                                                                                                                        ____________

                                                                                                                                                                        Profile paul and kirsty yates
                                                                                                                                                                        Avatar
                                                                                                                                                                        Send message
                                                                                                                                                                        Joined: Jan 1 06
                                                                                                                                                                        Posts: 11
                                                                                                                                                                        Credit: 811
                                                                                                                                                                        RAC: 0
                                                                                                                                                                        Message 2936 - Posted 5 Jun 2006 21:58:11 UTC - in response to Message 2856.

                                                                                                                                                                          Last modified: 5 Jun 2006 21:58:59 UTC

                                                                                                                                                                          Only an SSE2 or SSE3 compilation would be good for SZTAKI, and I think a simple native compilation would improve the speed by only 0-3%. Not more.
                                                                                                                                                                          MMX and SSE aren't good for SZTAKI. But I made another very small modification (about another 10 minutes of work) and it improved the speed by about 70%.



                                                                                                                                                                          Sorry to be dumb, but does this mean I can or cannot run sdg11202 on my 3dnow+ and mmx+ chip?


                                                                                                                                                                          ____________

                                                                                                                                                                          Profile Nightbird
                                                                                                                                                                          Forum moderator
                                                                                                                                                                          Avatar
                                                                                                                                                                          Send message
                                                                                                                                                                          Joined: Jul 12 05
                                                                                                                                                                          Posts: 920
                                                                                                                                                                          Credit: 114,924
                                                                                                                                                                          RAC: 0
                                                                                                                                                                          Message 2937 - Posted 5 Jun 2006 22:20:55 UTC - in response to Message 2936.

                                                                                                                                                                            Last modified: 5 Jun 2006 22:26:32 UTC

                                                                                                                                                                            Only an SSE2 or SSE3 compilation would be good for SZTAKI, and I think a simple native compilation would improve the speed by only 0-3%. Not more.
                                                                                                                                                                            MMX and SSE aren't good for SZTAKI. But I made another very small modification (about another 10 minutes of work) and it improved the speed by about 70%.

                                                                                                                                                                            Sorry to be dumb, but does this mean I can or cannot run sdg11202 on my 3dnow+ and mmx+ chip?


                                                                                                                                                                            I'm running sdg11202 on my Bartons (which only support MMX, 3DNow! and SSE).
                                                                                                                                                                            But with sdg11202, you may get more "0 credits" (invalid WUs).

                                                                                                                                                                            So the best option is to use sdg11201.

                                                                                                                                                                            According to AkosF, "these applications don't use new instructions so they have to run on 386 too."

                                                                                                                                                                            But the next application will be optimized for SSE2 and will not be compatible with your CPUs (or mine ;)).
                                                                                                                                                                            ____________

                                                                                                                                                                            Rakarin
                                                                                                                                                                            Avatar
                                                                                                                                                                            Send message
                                                                                                                                                                            Joined: Feb 4 06
                                                                                                                                                                            Posts: 17
                                                                                                                                                                            Credit: 46,513
                                                                                                                                                                            RAC: 0
                                                                                                                                                                            Message 2940 - Posted 6 Jun 2006 0:17:30 UTC - in response to Message 2937.


                                                                                                                                                                              I'm running sdg11202 on my Bartons (which only support MMX, 3DNow! and SSE).
                                                                                                                                                                              But with sdg11202, you may get more "0 credits" (invalid WUs).

                                                                                                                                                                              So the best option is to use sdg11201.

                                                                                                                                                                              According to AkosF, "these applications don't use new instructions so they have to run on 386 too."

                                                                                                                                                                              But the next application will be optimized for SSE2 and will not be compatible with your CPUs (or mine ;)).


                                                                                                                                                                              I had to switch back to 11201 because I was also getting 0-credit work units. (Running on an AMD Athlon XP 3200+, Barton core.) That was a bit disappointing. With version ~03, the work units were completing in 8-9 minutes. Eight or nine minutes!!! That was amazing!!! I can only imagine the times on newer processors.

                                                                                                                                                                              Unfortunately, having recently purchased a new G5 PPC Mac (which runs BOINC on one of the cores, F@H on the other), I can't justify a new PC "to make BOINC run faster"....

                                                                                                                                                                              If you look at a processor breakdown on some of the BOINC statistics sites, there are a lot of people running BOINC on older PCs. I leave my Linux box on (except in summer) to run BOINC, even though I don't use it as often. While tapping the new optimized instruction sets on new processors can be a tremendous advantage, excluding everyone else can be an equal loss.


                                                                                                                                                                              ____________

                                                                                                                                                                              Profile Nightbird
                                                                                                                                                                              Forum moderator
                                                                                                                                                                              Joined: Jul 12 05
                                                                                                                                                                              Posts: 920
                                                                                                                                                                              Credit: 114,924
                                                                                                                                                                              RAC: 0
                                                                                                                                                                              Message 2941 - Posted 6 Jun 2006 0:55:50 UTC

                                                                                                                                                                                Last modified: 6 Jun 2006 1:56:39 UTC

                                                                                                                                                                                I'm also running a Barton 3200+.
                                                                                                                                                                                Times with sdg11202? Between 6 and 12 minutes, depending on the WUs.
                                                                                                                                                                                I got 1 WU with 0 credit, but with all the WUs still pending, let's wait a little...

                                                                                                                                                                                I'm also running an Athlon64 3200+.
                                                                                                                                                                                Times: between 6 and 17 minutes, so a wider spread.

                                                                                                                                                                                My Barton 2500+ seems to be "the most sensitive" to the 0-credit problem.
                                                                                                                                                                                I will switch back to sdg11201 to see.

                                                                                                                                                                                Anyway, AkosF's application is pretty impressive, since it doesn't use the newest instructions.

                                                                                                                                                                                Edit: no machine is overclocked.

                                                                                                                                                                                ____________

                                                                                                                                                                                Profile Nightbird
                                                                                                                                                                                Forum moderator
                                                                                                                                                                                Joined: Jul 12 05
                                                                                                                                                                                Posts: 920
                                                                                                                                                                                Credit: 114,924
                                                                                                                                                                                RAC: 0
                                                                                                                                                                                Message 2942 - Posted 6 Jun 2006 1:12:00 UTC - in response to Message 2928.

                                                                                                                                                                                  Last modified: 6 Jun 2006 1:15:48 UTC

                                                                                                                                                                                  So, the SSE2 code is ready.

                                                                                                                                                                                  I launched this code on one computer to process about 250 WUs, and the original app on another computer with the same WUs.
                                                                                                                                                                                  I think it will be a very good test, because I will be able to compare the results without the validator, and I will be able to send these WUs back to check the validator too. :)

                                                                                                                                                                                  The first computer needs about 20 hours, the second one needs about... 10 days?!

                                                                                                                                                                                  You're running your (first) WUs in "turbo" mode! ;)

                                                                                                                                                                                  ____________

                                                                                                                                                                                  Profile groundhog
                                                                                                                                                                                  Joined: Jun 3 06
                                                                                                                                                                                  Posts: 1
                                                                                                                                                                                  Credit: 620
                                                                                                                                                                                  RAC: 0
                                                                                                                                                                                  Message 2943 - Posted 6 Jun 2006 1:40:29 UTC - in response to Message 2921.


                                                                                                                                                                                    I know these:
                                                                                                                                                                                    - sdg11201 produces results that match the official app exactly
                                                                                                                                                                                    - I got significantly more "zero credit" results with sdg11202 and sdg11203



                                                                                                                                                                                    Hello Akos, all,

                                                                                                                                                                                    I only joined SZTAKI recently, after reading about an Akos-optimized client on (IIRC) the Einstein board. I have never used the original client, only the latest optimized one, on two machines (Athlon XP2800, PIII-866).

                                                                                                                                                                                    I got credit for every WU that left the "pending" state; there are no "zero-credit" ones on my list:
                                                                                                                                                                                    http://szdg.lpds.sztaki.hu/szdg/results.php?userid=9182

                                                                                                                                                                                    No overclocking on my side.

                                                                                                                                                                                    Thanks for your work on speeding up the science by optimizing code at several projects.

                                                                                                                                                                                    Greetings Groundhog
                                                                                                                                                                                    ____________

                                                                                                                                                                                    Tom
                                                                                                                                                                                    Joined: May 26 06
                                                                                                                                                                                    Posts: 5
                                                                                                                                                                                    Credit: 44,277
                                                                                                                                                                                    RAC: 0
                                                                                                                                                                                    Message 2945 - Posted 6 Jun 2006 2:36:36 UTC - in response to Message 2928.

                                                                                                                                                                                      So, the SSE2 code is ready...The first computer needs about 20 hours, the second one needs about... 10 days?!

                                                                                                                                                                                      Have you considered and/or is it possible to establish a sztaki alpha/beta project where \'we\' can help you during your development? Might cut down on that 10 day lag.
                                                                                                                                                                                      ____________

                                                                                                                                                                                      akosf
                                                                                                                                                                                      Joined: Aug 30 05
                                                                                                                                                                                      Posts: 62
                                                                                                                                                                                      Credit: 510,419
                                                                                                                                                                                      RAC: 0
                                                                                                                                                                                      Message 2946 - Posted 6 Jun 2006 6:27:36 UTC - in response to Message 2945.

                                                                                                                                                                                        So, the SSE2 code is ready...The first computer needs about 20 hours, the second one needs about... 10 days?!

                                                                                                                                                                                        Have you considered establishing a SZTAKI alpha/beta project, if that is possible, where 'we' can help you during your development? It might cut down on that 10-day lag.

                                                                                                                                                                                        I cut it down to about 5 days.

                                                                                                                                                                                        Profile paul and kirsty yates
                                                                                                                                                                                        Joined: Jan 1 06
                                                                                                                                                                                        Posts: 11
                                                                                                                                                                                        Credit: 811
                                                                                                                                                                                        RAC: 0
                                                                                                                                                                                        Message 2947 - Posted 6 Jun 2006 6:28:58 UTC - in response to Message 2937.


                                                                                                                                                                                          I'm running sdg11202 on my Bartons (which only support MMX, 3DNow! and SSE).
                                                                                                                                                                                          But with sdg11202, you can get more "0 credit" results (invalid WUs).

                                                                                                                                                                                          So the best option is to use sdg11201.

                                                                                                                                                                                          According to AkosF, "these applications don't use new instructions so they have to run on 386 too."

                                                                                                                                                                                          But the next application will be optimized for SSE2 and will not be compatible with your cpus (and mine too ;)).


                                                                                                                                                                                          Thanks for the replies, everyone. I'll give it a go.
                                                                                                                                                                                          ____________

                                                                                                                                                                                          akosf
                                                                                                                                                                                          Joined: Aug 30 05
                                                                                                                                                                                          Posts: 62
                                                                                                                                                                                          Credit: 510,419
                                                                                                                                                                                          RAC: 0
                                                                                                                                                                                          Message 2954 - Posted 6 Jun 2006 18:32:15 UTC

                                                                                                                                                                                            I found some differences between SSE2 and the official app, so it will need some corrections.

                                                                                                                                                                                            akosf
                                                                                                                                                                                            Joined: Aug 30 05
                                                                                                                                                                                            Posts: 62
                                                                                                                                                                                            Credit: 510,419
                                                                                                                                                                                            RAC: 0
                                                                                                                                                                                            Message 2957 - Posted 6 Jun 2006 20:26:54 UTC - in response to Message 2954.

                                                                                                                                                                                              Last modified: 6 Jun 2006 20:35:02 UTC

                                                                                                                                                                                              I found some differences between SSE2 and the official app, so it will need some corrections.

                                                                                                                                                                                              So, status report after 25 WUs.

                                                                                                                                                                                              The official app found 7 "interesting" WUs out of 25 WUs (~28%).

                                                                                                                                                                                              The SSE2 app found 2 "interesting" WUs out of 251 WUs (~0.8%).
                                                                                                                                                                                              The SSE2 app was developed to generate the same truncations as the official one.

                                                                                                                                                                                              I checked these WUs with the most precise app, SDG11203.
                                                                                                                                                                                              SDG11203 didn't find any "interesting" WUs among them (less than 0.4%).
                                                                                                                                                                                              I think this is correct!
                                                                                                                                                                                              There have to be very, very few useful combinations (close to 0.0%)!
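
The percentages quoted above are plain ratios of the counts given in the post. A quick check in Python, as a sketch: the function name `hit_rate` is made up for illustration, and reading "less than 0.4%" as "fewer than 1 hit in 251 WUs" is an assumption, not something the post states.

```python
def hit_rate(interesting, total):
    """Share of 'interesting' WUs, as a percentage."""
    return 100.0 * interesting / total

official = hit_rate(7, 25)    # official app: 7 of 25
sse2 = hit_rate(2, 251)       # SSE2 app: 2 of 251
# Assumption: zero hits in 251 WUs is read as "below one hit in 251".
upper_bound = hit_rate(1, 251)

print(f"official app: {official:.1f}%")
print(f"SSE2 app:     {sse2:.1f}%")
print(f"one in 251:   {upper_bound:.2f}%")
```

The gap between roughly 28% and under 1% on the same kind of work is what makes the official app's hit rate look suspicious here.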


                                                                                                                                                                                              Second step:
                                                                                                                                                                                              I did a special app that consisted both methods, the original and the SSE2 code, and they checked each other in every run. I found some differences and I examined them.
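
The idea of running both methods side by side and letting them check each other can be sketched roughly like this. Everything here is hypothetical: `reference_kernel` and `optimized_kernel` are toy stand-ins, not the SZTAKI code, and the tolerance is an arbitrary choice for the illustration.

```python
import math

def reference_kernel(x):
    # Hypothetical stand-in for the original computation.
    return x * x - 2.0 * x

def optimized_kernel(x):
    # Hypothetical stand-in for the rewritten (e.g. SSE2) computation.
    # Algebraically the same, but evaluated in a different order.
    return x * (x - 2.0)

def cross_check(inputs, rel_tol=1e-12, abs_tol=1e-12):
    """Run both kernels on every input and collect the disagreements."""
    mismatches = []
    for x in inputs:
        a = reference_kernel(x)
        b = optimized_kernel(x)
        # NaN never compares equal to anything, so handle it explicitly.
        both_nan = math.isnan(a) and math.isnan(b)
        if not both_nan and not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((x, a, b))
    return mismatches

print(cross_check([0.0, 1.0, 2.5, 1e6]))   # ordinary inputs: no mismatches
print(cross_check([float("inf")]))          # one path yields NaN, the other inf
```

On ordinary inputs the two toy kernels agree. On an infinite input the first one computes `inf - inf`, which is NaN, while the second returns `inf`, so the harness flags it. A disagreement like that says only that the two paths differ; deciding which one is wrong still needs a closer look at each case.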

                                                                                                                                                                                              The official app is not right for its task.

                                                                                                                                                                                              It usually generated \"NaN\" values that means the CPU cannot interpret these number, but the program just do comparisions, multiplications with them. So, the results are perfectly wrong. They are based on faulty datas.
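
For readers unfamiliar with NaN ("not a number" in IEEE 754 floating point): arithmetic on a NaN gives NaN again, and every ordered comparison with a NaN is false. The Python sketch below shows how that can let bad values slip through a threshold test. The filter functions are hypothetical and are not taken from the official app; they only illustrate one way NaN could inflate the number of "interesting" results.

```python
import math

nan = float("nan")

# Arithmetic propagates NaN; comparisons with NaN are always false.
product = nan * 3.0
print(math.isnan(product))                   # True
print(nan < 1.0, nan > 1.0, nan == nan)      # False False False

def passes_filter(score, threshold):
    # Hypothetical filter: reject only when the score is below the threshold.
    # A NaN score is never "below" anything, so it is accepted by accident.
    if score < threshold:
        return False
    return True

def passes_filter_checked(score, threshold):
    # Same filter, but NaN is rejected explicitly before comparing.
    if math.isnan(score):
        return False
    return not (score < threshold)

print(passes_filter(nan, 10.0))              # True: NaN slips through
print(passes_filter_checked(nan, 10.0))      # False
```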

                                                                                                                                                                                              Edit: just a question:
                                                                                                                                                                                              I found these faults not only in the last multiplication phase, so
                                                                                                                                                                                              were the previous results (10th dimension) any good?

                                                                                                                                                                                              Profile Nightbird
                                                                                                                                                                                              Forum moderator
                                                                                                                                                                                              Joined: Jul 12 05
                                                                                                                                                                                              Posts: 920
                                                                                                                                                                                              Credit: 114,924
                                                                                                                                                                                              RAC: 0
                                                                                                                                                                                              Message 2958 - Posted 6 Jun 2006 20:32:19 UTC

                                                                                                                                                                                                Last modified: 6 Jun 2006 20:42:14 UTC

                                                                                                                                                                                                According to you, what would be needed to get an "official application" that is "right for the task"?
                                                                                                                                                                                                ____________

                                                                                                                                                                                                Profile Rebirther
                                                                                                                                                                                                Joined: Jul 12 05
                                                                                                                                                                                                Posts: 81
                                                                                                                                                                                                Credit: 15,472
                                                                                                                                                                                                RAC: 0
                                                                                                                                                                                                Message 2959 - Posted 6 Jun 2006 20:40:58 UTC

                                                                                                                                                                                                  Ohh, that doesn't sound good!

                                                                                                                                                                                                  LiborA
                                                                                                                                                                                                  Joined: Mar 17 06
                                                                                                                                                                                                  Posts: 25
                                                                                                                                                                                                  Credit: 12,902
                                                                                                                                                                                                  RAC: 0
                                                                                                                                                                                                  Message 2962 - Posted 6 Jun 2006 21:07:14 UTC

That's not good news. Now I expect a very fast reaction from the project side.
                                                                                                                                                                                                    ____________

                                                                                                                                                                                                    Profile paul and kirsty yates
                                                                                                                                                                                                    Joined: Jan 1 06
                                                                                                                                                                                                    Posts: 11
                                                                                                                                                                                                    Credit: 811
                                                                                                                                                                                                    RAC: 0
                                                                                                                                                                                                    Message 2963 - Posted 6 Jun 2006 21:16:14 UTC

Last modified: 6 Jun 2006 21:18:01 UTC

"The official app is not right for its task."


                                                                                                                                                                                                      oh eck there could be trouble at mill !!!


Sorry if you don't understand;
it was a saying in the coal mines of England.




                                                                                                                                                                                                      ____________

                                                                                                                                                                                                      akosf
                                                                                                                                                                                                      Joined: Aug 30 05
                                                                                                                                                                                                      Posts: 62
                                                                                                                                                                                                      Credit: 510,419
                                                                                                                                                                                                      RAC: 0
                                                                                                                                                                                                      Message 2964 - Posted 6 Jun 2006 21:33:54 UTC - in response to Message 2958.

According to you, what would be needed to get an "official application" "right for the task"?

I don't know the science behind this application, but the input data are easy to understand.

If you open a WU you will see something like this (this is a range):
2 -4 9 -26 -40 -40 -40 -40 -40 -11 4 -2 1 : 2 -4 9 -26 40 40 40 40 40 -11 4 -2 1

The FPU would be enough if this range were a bit narrower, like this:
-2..2 -4..4 -9..9 -16..16 -25..25 -36..36 -49..49 -36..36 -25..25 -16..16 -9..9 -4..4 -2..2
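
To make the WU line above concrete, here is a minimal Python sketch. It assumes (the thread does not confirm this) that the two colon-separated vectors are inclusive per-coefficient lower and upper bounds. `parse_range` and `count_candidates` are hypothetical helper names, not part of the SZTAKI application.

```python
from math import prod

# The example work-unit line quoted in the post.
WU_LINE = ("2 -4 9 -26 -40 -40 -40 -40 -40 -11 4 -2 1 : "
           "2 -4 9 -26 40 40 40 40 40 -11 4 -2 1")

def parse_range(line):
    """Split a 'low : high' line into two equal-length lists of ints."""
    low_text, high_text = line.split(":")
    low = [int(tok) for tok in low_text.split()]
    high = [int(tok) for tok in high_text.split()]
    if len(low) != len(high):
        raise ValueError("bound vectors differ in length")
    return low, high

def count_candidates(low, high):
    """Count integer vectors v with low[i] <= v[i] <= high[i] (assumed inclusive)."""
    return prod(hi - lo + 1 for lo, hi in zip(low, high))

low, high = parse_range(WU_LINE)
# Positions where the two bounds differ, i.e. coefficients that actually vary.
free = [i for i, (lo, hi) in enumerate(zip(low, high)) if lo != hi]

print(len(low), "coefficients,", len(free), "of them vary")
print("candidates in this range:", count_candidates(low, high))
```

Under that reading, only the five middle coefficients vary in this WU, each over -40..40, and the rest are fixed.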

The problem is that the current calculation method needs an exponent of at least 13 or 14 bits. SSE2 and the FPU method currently used support only a 12-bit format, but the FPU would be able to handle 15 bits (80-bit extended precision).
The only disadvantage of extended precision is its very slow memory movements (about 5 times slower than 64-bit double precision), but clever programming can eliminate that! SDG11203 does so: its code keeps the big numbers in the 80-bit FPU registers, so it doesn't need the slow memory movements (load/store).
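
The exponent-range argument can be checked with a little arithmetic. The sketch below uses the standard field widths: IEEE 754 double precision has an 11-bit exponent field and x87 extended precision a 15-bit one (the "12 bit" figure above may be counted differently). The intermediate magnitude of 2^5000 is made up purely for illustration, the helper names are hypothetical, and the code assumes Python floats are IEEE 754 doubles, as they are on common platforms.

```python
import math
import sys

DOUBLE_EXPONENT_BITS = 11    # IEEE 754 double precision (64-bit)
EXTENDED_EXPONENT_BITS = 15  # x87 extended precision (80-bit)

def max_binary_exponent(exponent_bits):
    """Largest unbiased exponent of a normal number for a given field width."""
    return 2 ** (exponent_bits - 1) - 1

def exponent_bits_needed(binary_exponent):
    """Smallest exponent field width whose range reaches 2**binary_exponent."""
    bits = 2
    while max_binary_exponent(bits) < binary_exponent:
        bits += 1
    return bits

def overflows_double(binary_exponent):
    """True if 2**binary_exponent cannot be stored in a Python float (a double)."""
    try:
        math.ldexp(1.0, binary_exponent)
    except OverflowError:
        return True
    return False

# Hypothetical magnitude of an intermediate result, for illustration only.
HYPOTHETICAL_EXPONENT = 5000

print("double reaches 2 **", max_binary_exponent(DOUBLE_EXPONENT_BITS))
print("extended reaches 2 **", max_binary_exponent(EXTENDED_EXPONENT_BITS))
print("this platform's float reaches 2 **", sys.float_info.max_exp - 1)
print("exponent bits needed for 2 ** 5000:",
      exponent_bits_needed(HYPOTHETICAL_EXPONENT))
print("2 ** 5000 overflows a double:", overflows_double(HYPOTHETICAL_EXPONENT))
```

A value of that size needs a 14-bit exponent field, so it overflows a double but still fits the 15-bit exponent of the 80-bit format.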

                                                                                                                                                                                                        baracutio
                                                                                                                                                                                                        Joined: Sep 4 05
                                                                                                                                                                                                        Posts: 5
                                                                                                                                                                                                        Credit: 3,103
                                                                                                                                                                                                        RAC: 0
                                                                                                                                                                                                        Message 2965 - Posted 6 Jun 2006 23:51:30 UTC - in response to Message 2964.

According to you, what would be needed to get an "official application" "right for the task"?

I don't know the science behind this application, but the input data are easy to understand.

If you open a WU you will see something like this (this is a range):
2 -4 9 -26 -40 -40 -40 -40 -40 -11 4 -2 1 : 2 -4 9 -26 40 40 40 40 40 -11 4 -2 1

The FPU would be enough if this range were a bit narrower, like this:
-2..2 -4..4 -9..9 -16..16 -25..25 -36..36 -49..49 -36..36 -25..25 -16..16 -9..9 -4..4 -2..2

The problem is that the current calculation method needs an exponent of at least 13 or 14 bits. SSE2 and the FPU method currently used support only a 12-bit format, but the FPU would be able to handle 15 bits (80-bit extended precision).
The only disadvantage of extended precision is its very slow memory movements (about 5 times slower than 64-bit double precision), but clever programming can eliminate that! SDG11203 does so: its code keeps the big numbers in the 80-bit FPU registers, so it doesn't need the slow memory movements (load/store).


Hmm... sounds difficult :) This means that SDG11203 could be the next "official" app, right? Correct me if I'm wrong...



Best regards, bara

                                                                                                                                                                                                          akosf
                                                                                                                                                                                                          Joined: Aug 30 05
                                                                                                                                                                                                          Posts: 62
                                                                                                                                                                                                          Credit: 510,419
                                                                                                                                                                                                          RAC: 0
                                                                                                                                                                                                          Message 2966 - Posted 7 Jun 2006 6:12:25 UTC - in response to Message 2965.

Hmm... sounds difficult :) This means that SDG11203 could be the next "official" app, right? Correct me if I'm wrong...

                                                                                                                                                                                                            No... SDG11203 is only a modification, and not from SZTAKI.
                                                                                                                                                                                                            I think SZTAKI will correct this problem with a new application.

                                                                                                                                                                                                            Profile Nightbird
                                                                                                                                                                                                            Forum moderator
                                                                                                                                                                                                            Joined: Jul 12 05
                                                                                                                                                                                                            Posts: 920
                                                                                                                                                                                                            Credit: 114,924
                                                                                                                                                                                                            RAC: 0
                                                                                                                                                                                                            Message 2968 - Posted 7 Jun 2006 7:58:28 UTC

                                                                                                                                                                                                              I sent an email to Adam and Attila.
                                                                                                                                                                                                              Questions need answers.
                                                                                                                                                                                                              ____________

                                                                                                                                                                                                              Profile kadam
                                                                                                                                                                                                              Project administrator
                                                                                                                                                                                                              Joined: May 25 05
                                                                                                                                                                                                              Posts: 589
                                                                                                                                                                                                              Credit: 38,614
                                                                                                                                                                                                              RAC: 0
                                                                                                                                                                                                              Message 2969 - Posted 7 Jun 2006 8:19:18 UTC - in response to Message 2968.

Our mathematician colleagues haven't replied to this question yet. In the meantime we are testing the new algorithm, which will hopefully provide the necessary precision, if it proves to be needed.

Although I know that Akos is really trying to help the project, as a system administrator responsible for thousands of computers I have to state that we can't take any responsibility for applications downloaded from outside the project.

                                                                                                                                                                                                                More info soon...
                                                                                                                                                                                                                ____________
                                                                                                                                                                                                                If you like BOINC, you may also find CaretCursor to be appealing.

                                                                                                                                                                                                                Profile UBT - Halifax--lad
                                                                                                                                                                                                                Joined: Sep 10 05
                                                                                                                                                                                                                Posts: 126
                                                                                                                                                                                                                Credit: 3,147
                                                                                                                                                                                                                RAC: 0
                                                                                                                                                                                                                Message 2970 - Posted 7 Jun 2006 8:24:50 UTC - in response to Message 2969.

Although I know that Akos is really trying to help the project, as a system administrator responsible for thousands of computers I have to state that we can't take any responsibility for applications downloaded from outside the project.


That's stating the obvious: we as users know that this app is not official, so we take responsibility for using it ourselves.

                                                                                                                                                                                                                  ____________

                                                                                                                                                                                                                  Profile kadam
                                                                                                                                                                                                                  Project administrator
                                                                                                                                                                                                                  Joined: May 25 05
                                                                                                                                                                                                                  Posts: 589
                                                                                                                                                                                                                  Credit: 38,614
                                                                                                                                                                                                                  RAC: 0
                                                                                                                                                                                                                  Message 2971 - Posted 7 Jun 2006 9:56:54 UTC - in response to Message 2969.

                                                                                                                                                                                                                    Here is the answer from our mathematician colleagues:

Akos is partly right that the current algorithm can exceed the 12 bits of the current number representation. However, as Akos has also mentioned, it results in a faster algorithm, although it is true that it allows some "miscalculation". In the big picture it is still faster to quickly sort out the incorrect polynomials later on the server than to run a fully precise but extremely slow algorithm.

As a final solution to the problem, the new algorithm is now in the testing phase, and it facilitates higher number precision...
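
The trade-off described above, a fast but imprecise pass on the clients followed by an exact re-check on the server, can be sketched generically in Python. This is not the project's algorithm: the acceptance criterion (a polynomial vanishing at a fixed rational point), the tolerance, and all function names are assumptions chosen only to show the two-stage pattern.

```python
from fractions import Fraction
from itertools import product

def horner_float(coeffs, x):
    """Fast floating-point evaluation (Horner's rule); subject to rounding."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc

def horner_exact(coeffs, x):
    """Slow exact evaluation with rational arithmetic."""
    acc = Fraction(0)
    for c in coeffs:
        acc = acc * x + c
    return acc

def fast_filter(candidates, x, tol):
    """Client-side stage: keep anything that looks close; false positives allowed."""
    xf = float(x)
    return [c for c in candidates if abs(horner_float(c, xf)) <= tol]

def exact_check(candidates, x):
    """Server-side stage: discard the false positives with exact arithmetic."""
    return [c for c in candidates if horner_exact(c, x) == 0]

POINT = Fraction(1, 3)  # hypothetical test point
# All quadratics a*x^2 + b*x + c with coefficients in -3..3.
candidates = list(product(range(-3, 4), repeat=3))

shortlist = fast_filter(candidates, POINT, tol=0.2)
confirmed = exact_check(shortlist, POINT)

print(len(candidates), "candidates,", len(shortlist), "pass the fast filter,",
      len(confirmed), "confirmed exactly")
```

The expensive exact check runs only on the shortlist, which is why tolerating some wrong hits in the fast stage can still be cheaper overall.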
                                                                                                                                                                                                                    ____________

                                                                                                                                                                                                                    vonHalenbach
                                                                                                                                                                                                                    Joined: May 31 06
                                                                                                                                                                                                                    Posts: 25
                                                                                                                                                                                                                    Credit: 461
                                                                                                                                                                                                                    RAC: 0
                                                                                                                                                                                                                    Message 2972 - Posted 7 Jun 2006 10:10:28 UTC - in response to Message 2964.

                                                                                                                                                                                                                      Last modified: 7 Jun 2006 10:23:14 UTC



If you open a WU you will see something like this (this is a range):
                                                                                                                                                                                                                      2 -4 9 -26 -40 -40 -40 -40 -40 -11 4 -2 1 : 2 -4 9 -26 40 40 40 40 40 -11 4 -2 1

The FPU would be enough if this range were a bit narrower, like this:
                                                                                                                                                                                                                      -2..2 -4..4 -9..9 -16..16 -25..25 -36..36 -49..49 -36..36 -25..25 -16..16 -9..9 -4..4 -2..2

The problem is that the current calculation method needs an exponent of at least 13 or 14 bits. SSE2 and the FPU method currently in use support only a 12-bit format, but the FPU would be able to handle 15 bits (extended precision, 80-bit).
The only disadvantage of extended precision is its very slow memory movements (about 5 times slower than double precision, 64-bit), but clever programming would be able to eliminate it! SDG11203 does this: the code keeps the big numbers in the 80-bit FPU registers, so it doesn't need slow memory movements (load-store).
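To put rough numbers on the exponent sizes mentioned above: an IEEE 754 double has an 11-bit exponent field (12 bits if the sign bit is counted), while the x87 extended format has a 15-bit exponent field. The Python sketch below is only an illustration and is not code from the SZTAKI application. `fits_exponent_field` and `overflow_then_nan` are hypothetical helpers: the first checks whether a power of two fits a biased exponent field of a given width, the second shows how a double overflows to infinity and how a later subtraction turns that into NaN.

```python
import sys

def fits_exponent_field(field_bits, binary_exponent):
    """Return True if 2**binary_exponent is a normal number in an
    IEEE-style format whose biased exponent field has `field_bits` bits."""
    bias = 2 ** (field_bits - 1) - 1
    return 1 - bias <= binary_exponent <= bias

def overflow_then_nan():
    """Overflow a double to inf, then produce NaN from inf - inf."""
    overflowed = sys.float_info.max * 2.0  # beyond the double range
    return overflowed, overflowed - overflowed

if __name__ == "__main__":
    # 2**2000 needs more than an 11-bit exponent field but fits in 15 bits.
    print(fits_exponent_field(11, 2000))
    print(fits_exponent_field(15, 2000))
    print(overflow_then_nan())
```

Python floats are doubles, so the 80-bit case can only be checked arithmetically here, not executed.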


I am no mathematician.
This is a randomly chosen range that popped up from millions of WUs because the bit size of the CPU is too small. Is it possible that there are ranges to compute that exceed even an 80-bit range? Would we then need a completely new CPU to calculate that range, or can it be cut into little pieces that are computed separately and then added up into one big result?
We are probably talking about very big numbers here.
It is just important to know exactly which WU is computable and which is not. Those WUs that cause problems have to be stored away for later analysis with better computers than ours.
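On the "cut into little pieces" idea: that is roughly what software arbitrary-precision arithmetic does, storing one big number as many machine words, so no new CPU is needed, only more time per operation. The sketch below is an assumption-laden illustration, not the project's algorithm. It treats the numbers from the quoted WU line as polynomial coefficients (an assumption made only for this example) and evaluates them with Horner's rule, once in double precision and once with Python's built-in exact integers. `horner_float`, `horner_exact` and `COEFFS` are hypothetical names.

```python
def horner_float(coeffs, x):
    """Evaluate a polynomial in double precision (can overflow to inf)."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc

def horner_exact(coeffs, x):
    """Evaluate the same polynomial with exact, arbitrary-size integers."""
    acc = 0
    for c in coeffs:
        acc = acc * x + c
    return acc

# Sample values copied from the WU line quoted above; using them as
# polynomial coefficients is an assumption for illustration only.
COEFFS = [2, -4, 9, -26, -40, -40, -40, -40, -40, -11, 4, -2, 1]

if __name__ == "__main__":
    print(horner_float(COEFFS, 1e30))                    # overflows the double range
    print(horner_exact(COEFFS, 10**30).bit_length())     # far more than 80 bits
```

The exact version never overflows, but every multiplication gets slower as the numbers grow, which is the trade-off discussed in this thread.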

As a final solution to the problem, the new algorithm is finally in the testing phase and it facilitates higher numeric precision...


And this new algorithm should stop processing immediately when NaN problems pop up, rather than calculating this too-big-to-calculate range to the end. That saves time!
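A minimal sketch of what such an early exit could look like, assuming the computation is some repeated step on a floating-point value. The function `iterate_with_early_exit` and the squaring step are hypothetical and only stand in for the real calculation, which is not shown in this thread.

```python
import math

def iterate_with_early_exit(x0, step, max_steps):
    """Apply `step` repeatedly and stop as soon as the value becomes
    inf or NaN. Returns (last_value, steps_done, ok)."""
    x = x0
    for i in range(max_steps):
        x = step(x)
        if not math.isfinite(x):
            # Further work on this range is pointless: report and stop.
            return x, i + 1, False
    return x, max_steps, True

def square(x):
    """Stand-in for one step of the real computation."""
    return x * x

if __name__ == "__main__":
    print(iterate_with_early_exit(1e10, square, 1000))
    print(iterate_with_early_exit(1.0, square, 1000))
```

`math.isfinite` rejects both infinity and NaN, so the check also catches the overflow that usually comes before the NaN.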

                                                                                                                                                                                                                      Greetings
                                                                                                                                                                                                                      vonHalenbach
                                                                                                                                                                                                                      ____________

                                                                                                                                                                                                                      akosf
                                                                                                                                                                                                                      Joined: Aug 30 05
                                                                                                                                                                                                                      Posts: 62
                                                                                                                                                                                                                      Credit: 510,419
                                                                                                                                                                                                                      RAC: 0
                                                                                                                                                                                                                      Message 2973 - Posted 7 Jun 2006 10:10:51 UTC - in response to Message 2971.

                                                                                                                                                                                                                        Last modified: 7 Jun 2006 10:19:21 UTC

                                                                                                                                                                                                                        Hi Adam!

                                                                                                                                                                                                                        Thanks for the fast answer!

Akos is partly right that the current algorithm can run out of the 12 bits of the current number representation method. However, as Akos has also mentioned, this representation results in a faster algorithm, although it is true that it allows some "miscalculation". But in the big picture it is still faster to quickly sort out the incorrect polynomials later on the server than to run a fully precise but extremely slow algorithm.

This sorting algorithm can drop the bad polynomials, but sometimes the application can't find some good variations because of these miscalculations. So some good combinations stay hidden.
Isn't that a problem?
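The concern can be shown with a toy example. The sketch below is purely illustrative and uses a made-up acceptance test, not the project's real criterion: a fast double-precision check plays the role of the client, and an exact integer check plays the role of the server-side sorter. All names (`passes_fast`, `passes_exact`, `two_stage`, `CANDIDATES`) are hypothetical.

```python
def passes_fast(a, b, target):
    """Client-side check in double precision (may miscalculate)."""
    return (float(a) + float(b)) - float(a) == float(target)

def passes_exact(a, b, target):
    """Server-side check with exact integer arithmetic."""
    return (a + b) - a == target

def two_stage(candidates):
    """Fast filter first, exact recheck only on the survivors."""
    survivors = [c for c in candidates if passes_fast(*c)]
    return [c for c in survivors if passes_exact(*c)]

CANDIDATES = [
    (10, 1, 1),       # good, and the fast check agrees
    (10, 1, 2),       # bad, and the fast check agrees
    (10**16, 1, 0),   # bad, but the fast check accepts it (false positive)
    (10**16, 1, 1),   # good, but the fast check rejects it (false negative)
]

if __name__ == "__main__":
    found = two_stage(CANDIDATES)
    truly_good = [c for c in CANDIDATES if passes_exact(*c)]
    missed = [c for c in truly_good if c not in found]
    print(found)
    print(missed)
```

The exact recheck removes the false positive, but the candidate rejected by the fast check never reaches the server, so it stays hidden exactly as described above.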

As a final solution to the problem, the new algorithm is finally in the testing phase and it facilitates higher numeric precision...

I'm waiting for that.



                                                                                                                                                                                                                        Copyright © 2017 SZTAKI Desktop Grid