Long WUs


[B^S] thierry@home
Joined: Jul 12 05
Posts: 98
Credit: 255,646
RAC: 0
Message 1945 - Posted 23 Feb 2006 20:23:59 UTC

    I have a WU that is at 11% after 3 hours... This is unusual after all those short ones. Should I worry about that?

    ____________

    Nightbird
    Forum moderator
    Joined: Jul 12 05
    Posts: 920
    Credit: 114,924
    RAC: 0
    Message 1946 - Posted 23 Feb 2006 20:33:02 UTC

      Last modified: 23 Feb 2006 20:40:24 UTC

      I would say that we are in a period of transition (waiting for the new version of the application)
      http://szdg.lpds.sztaki.hu/szdg/forum_thread.php?id=42#1937
      http://szdg.lpds.sztaki.hu/szdg/forum_thread.php?id=231

      The current deadline can be a problem for now if your cache is too big.
      ____________

      [B^S] thierry@home
      Joined: Jul 12 05
      Posts: 98
      Credit: 255,646
      RAC: 0
      Message 1947 - Posted 23 Feb 2006 20:45:49 UTC

        Sorry, I hadn't seen that thread (labelled Solaris...). Usually I receive two or three WUs, and of course now that they are longer I received 12 WUs... ;-)


        ____________

        Nightbird
        Forum moderator
        Joined: Jul 12 05
        Posts: 920
        Credit: 114,924
        RAC: 0
        Message 1948 - Posted 23 Feb 2006 21:06:23 UTC - in response to Message 1947.

          Sorry, I hadn't seen that thread (labelled Solaris...). Usually I receive two or three WUs, and of course now that they are longer I received 12 WUs... ;-)


          I received 23 + 21 + 25 WUs = 69 WUs "grin"

          ____________

          [B^S] thierry@home
          Joined: Jul 12 05
          Posts: 98
          Credit: 255,646
          RAC: 0
          Message 1953 - Posted 23 Feb 2006 21:55:07 UTC

            I saw that, but I also have some other WUs to finish ;-)

            ____________

            Nightbird
            Forum moderator
            Joined: Jul 12 05
            Posts: 920
            Credit: 114,924
            RAC: 0
            Message 1954 - Posted 23 Feb 2006 22:05:08 UTC - in response to Message 1953.

              I saw that, but I also have some other WUs to finish ;-)

              Me too. ;)
              ____________

              Etien1984
              Joined: Oct 18 05
              Posts: 19
              Credit: 1,011,330
              RAC: 0
              Message 1955 - Posted 23 Feb 2006 22:06:40 UTC

                I like big WUs :) (as long as I get credit :D)
                ____________


                Ananas
                Joined: Jul 12 05
                Posts: 222
                Credit: 665,833
                RAC: 0
                Message 1956 - Posted 23 Feb 2006 22:41:33 UTC

                  Agreed, I prefer the big ones too.

                  Trulayne
                  Joined: Oct 16 05
                  Posts: 36
                  Credit: 11,377
                  RAC: 0
                  Message 1957 - Posted 24 Feb 2006 1:05:07 UTC

                    Long workunits are nice, but when BOINC thinks they are short workunits and downloads too many for the deadline, it becomes a problem. I run multiple projects on my three computers and BOINC balances them very nicely when estimated times are close to correct. The new workunits are a good twenty times longer than BOINC thinks they are. This is causing BOINC to go into earliest-deadline mode, which will still not be fast enough, even using 100% of the CPU, to get these workunits done before the deadline. Please give these workunits a more accurate estimate so our other projects do not suffer.

                    ____________

                    Etien1984
                    Joined: Oct 18 05
                    Posts: 19
                    Credit: 1,011,330
                    RAC: 0
                    Message 1959 - Posted 24 Feb 2006 8:06:28 UTC - in response to Message 1957.

                      Long workunits are nice, but when BOINC thinks they are short workunits and downloads too many for the deadline, it becomes a problem. I run multiple projects on my three computers and BOINC balances them very nicely when estimated times are close to correct. The new workunits are a good twenty times longer than BOINC thinks they are. This is causing BOINC to go into earliest-deadline mode, which will still not be fast enough, even using 100% of the CPU, to get these workunits done before the deadline. Please give these workunits a more accurate estimate so our other projects do not suffer.


                      I got 5 WUs and the deadline is 2006.02.27. Each WU runs about 9 hours (I think, because the first one ran for 9 hours :)). So the 5 WUs will be finished in 2 days (my computer is always running). So if you have no time to finish some WUs, give them to me :)
                      ____________


                      Bill Hounslow
                      Joined: Aug 20 05
                      Posts: 10
                      Credit: 10,659
                      RAC: 0
                      Message 1963 - Posted 24 Feb 2006 18:36:11 UTC - in response to Message 1957.

                        Long workunits are nice, but when BOINC thinks they are short workunits and downloads too many for the deadline, it becomes a problem. I run multiple projects on my three computers and BOINC balances them very nicely when estimated times are close to correct. The new workunits are a good twenty times longer than BOINC thinks they are. This is causing BOINC to go into earliest-deadline mode, which will still not be fast enough, even using 100% of the CPU, to get these workunits done before the deadline. Please give these workunits a more accurate estimate so our other projects do not suffer.


                        I'll endorse that.

                        One of my machines is now hopelessly swamped, so I can either let it run in 'earliest deadline' mode and allow other BOINC projects to go past their deadlines too, or suspend SZTAKI and allow some 40 WUs to expire. Obviously, I'll do the latter.

                        But a realistic completion time would have made everyone happy and prevented SZTAKI from hogging freely-given computer time...





                        ____________

                        Trulayne
                        Joined: Oct 16 05
                        Posts: 36
                        Credit: 11,377
                        RAC: 0
                        Message 1969 - Posted 25 Feb 2006 1:51:45 UTC

                          A longer deadline would also help when running these longer workunits.

                          ____________

                          Robert Nelson
                          Joined: Jul 12 05
                          Posts: 9
                          Credit: 419,683
                          RAC: 0
                          Message 1970 - Posted 25 Feb 2006 2:57:48 UTC

                            Similar problem here. The system had adjusted to work units that ran in less than 30 minutes, in some cases as few as 10. So now I have received an absolute overload of work units which say they will run in less than 30 minutes but take 10 to 12 hours. There is no way that they can all be completed by the deadline.
                            ____________

                            UBT - Halifax--lad
                            Joined: Sep 10 05
                            Posts: 126
                            Credit: 3,147
                            RAC: 0
                            Message 1971 - Posted 25 Feb 2006 14:53:20 UTC - in response to Message 1970.

                              Similar problem here. The system had adjusted to work units that ran in less than 30 minutes, in some cases as few as 10. So now I have received an absolute overload of work units which say they will run in less than 30 minutes but take 10 to 12 hours. There is no way that they can all be completed by the deadline.


                              Simple: just abort the ones that won't finish and leave the rest to BOINC; it will learn again that the WUs are long.

                              In 2 weeks or so an optimised version will be ready for the 12th dimension, so that should help.
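
Working out how many units "won't finish" is simple arithmetic. A minimal sketch, assuming every queued WU takes about the same time, all share one deadline, and the project gets the whole CPU; the function name and the example numbers are hypothetical and are not part of BOINC:

```python
def units_to_abort(queued, hours_per_unit, hours_to_deadline, cpus=1):
    """Return how many queued work units cannot finish before the deadline.

    Assumes every unit takes the same time, all share one deadline, and
    the project gets 100% of each CPU (other projects suspended).
    """
    # Whole units a single CPU can complete in the time that is left.
    per_cpu = int(hours_to_deadline // hours_per_unit)
    can_finish = min(queued, per_cpu * cpus)
    return queued - can_finish


# Hypothetical example: 12 queued units of about 11 hours each,
# 48 hours left until the deadline, one CPU.
print(units_to_abort(12, 11, 48))  # prints 8
```

If other projects keep their share of the CPU, the usable hours shrink accordingly and more units have to go.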
                              ____________

                              Nightbird
                              Forum moderator
                              Joined: Jul 12 05
                              Posts: 920
                              Credit: 114,924
                              RAC: 0
                              Message 1972 - Posted 25 Feb 2006 15:34:11 UTC

                                Last modified: 25 Feb 2006 15:39:58 UTC

                                If the deadline is not extended quickly, you will need to abort the WUs you can't finish by 27 Feb.
                                ____________

                                Bob Carlton
                                Joined: Jan 13 06
                                Posts: 18
                                Credit: 9,224
                                RAC: 0
                                Message 1973 - Posted 25 Feb 2006 16:43:29 UTC - in response to Message 1972.

                                  If the deadline is not extended quickly, you will need to abort the WUs you can't finish by 27 Feb.


                                  I already sent a few back aborted. The switch happened without warning, so I got caught with 44 new workunits; I will send back about 20 by the time the deadline comes, unless it is extended.
                                  ____________

                                  Ananas
                                  Joined: Jul 12 05
                                  Posts: 222
                                  Credit: 665,833
                                  RAC: 0
                                  Message 1974 - Posted 25 Feb 2006 17:07:15 UTC

                                    Last modified: 25 Feb 2006 17:11:13 UTC

                                    If the estimated time is not fixed very soon, I will not have much choice other than to detach, at least on some machines.

                                    As SZTAKI isn't my only project, reducing the cache size to 0.05 days can only be a temporary workaround. I used to have 0.35, which is already very small, but that was still too much, as some SZTAKI results will take 16+ hours (MP2600+) with an estimated time of 1.25 hours :-/

                                    I do like the bigger WUs but they shouldn't camouflage their size.
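
To put numbers on the cache problem described above, here is a rough sketch. It assumes a simplified model in which the client keeps fetching work until the *estimated* run time covers the cache, on one CPU running a single project; real BOINC work fetch is more involved, and the function name is hypothetical:

```python
import math


def cache_fill(cache_days, estimated_hours, actual_hours):
    """Return (work units fetched, real hours of work) under a simplified model."""
    cache_hours = cache_days * 24.0
    # Simplifying assumption: fetch until the estimated work covers the cache.
    units = max(1, math.ceil(cache_hours / estimated_hours))
    return units, units * actual_hours


# Figures from the post: 1.25 h estimated, 16 h actual, caches of 0.35 and 0.05 days.
for cache_days in (0.35, 0.05):
    units, real_hours = cache_fill(cache_days, 1.25, 16)
    print(f"cache {cache_days} d: {units} WUs, {real_hours} h of real work")
```

Under this model a 0.35-day cache pulls in 7 WUs, which is 112 hours of real work, while 0.05 days limits the fetch to a single 16-hour WU.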

                                    Nightbird
                                    Forum moderator
                                    Joined: Jul 12 05
                                    Posts: 920
                                    Credit: 114,924
                                    RAC: 0
                                    Message 1975 - Posted 25 Feb 2006 18:03:13 UTC - in response to Message 1974.

                                      Last modified: 25 Feb 2006 23:13:01 UTC

                                      If the estimated time is not fixed very soon, I will not have much choice other than to detach, at least on some machines.

                                      As SZTAKI isn't my only project, reducing the cache size to 0.05 days can only be a temporary workaround. I used to have 0.35, which is already very small, but that was still too much, as some SZTAKI results will take 16+ hours (MP2600+) with an estimated time of 1.25 hours :-/

                                      I do like the bigger WUs but they shouldn't camouflage their size.

                                      We all have the same problem for now, and obviously nothing will be corrected during this weekend.
                                      Extending the deadline is only a last resort, but at least that would help with the WUs being sent.


                                      ____________

                                      [B^S] thierry@home
                                      Joined: Jul 12 05
                                      Posts: 98
                                      Credit: 255,646
                                      RAC: 0
                                      Message 1976 - Posted 25 Feb 2006 21:38:00 UTC

                                        Last modified: 25 Feb 2006 21:38:07 UTC

                                        Normally, once you have finished some WUs, the estimated completion time is updated. After the 12 WUs received yesterday, today I am receiving them one WU at a time, with a cache of 0.1 days and an estimated time of 14 hours.

                                        ____________

                                        madmac
                                        Joined: Sep 22 05
                                        Posts: 27
                                        Credit: 1,483
                                        RAC: 0
                                        Message 1990 - Posted 26 Feb 2006 8:06:31 UTC

                                          I too have a long WU, but my problem is that it did 5 hrs, then the next day it went back to zero even though my preferences are set to leave applications in memory. I thought that this problem had been fixed, or do I need to do something with memory again? If there is no answer by tonight I will abort it.
                                          ____________

                                          madmac
                                          Joined: Sep 22 05
                                          Posts: 27
                                          Credit: 1,483
                                          RAC: 0
                                          Message 1992 - Posted 26 Feb 2006 11:14:49 UTC

                                            Further information on my problem: CPU time went back to zero, but the percentage after one hour today was past its old 30% mark, so what is going on?
                                            ____________

                                            Ananas
                                            Joined: Jul 12 05
                                            Posts: 222
                                            Credit: 665,833
                                            RAC: 0
                                            Message 1996 - Posted 26 Feb 2006 14:13:00 UTC

                                              Last modified: 26 Feb 2006 14:15:06 UTC

                                              It does that all the time, preferably on 2 and 4 CPU machines.

                                              I will finish some WUs now, intercepting the traffic to the sztaki server (through an entry in /etc/hosts), fix the <final_cpu_time> entries in client_state.xml when they are done and then allow contact to the server again.

                                              This might help but I haven't tried it yet. It will not damage anything, that I'm quite sure about. It would be a shame to receive 0 credits for a bunch of 16-hour WUs.

                                              Ageless
                                              Joined: Jul 12 05
                                              Posts: 64
                                              Credit: 3,001
                                              RAC: 0
                                              Message 1999 - Posted 26 Feb 2006 14:53:33 UTC

                                                Last modified: 26 Feb 2006 14:53:50 UTC

                                                Just aborted 7 SZTAKI units and set the project to No New Tasks. Effective immediately, it's no longer possible to run it in combination with other long-running units, because of that stupidly short deadline. Even the unit that had just started needed to be returned within 11 hours. That is a problem if the units run for 12 hours or more.

                                                And no, the cache I run isn't that big: 0.2 days' worth. (That was 8 units of 30 minutes, by its guess...)

                                                ____________
                                                Jord - "Looking for an answer solves 90% of the questions asked!"

                                                Boinc Wiki - Information at your fingertips.

                                                KWSN Sir Clark
                                                Joined: Aug 1 05
                                                Posts: 17
                                                Credit: 31,037
                                                RAC: 0
                                                Message 2008 - Posted 26 Feb 2006 17:34:16 UTC

                                                  I crunch all the projects you see in my sig on a single machine with equal resource shares for all. I crunch the occasional unit for short-WU projects on my parents' PC when they aren't going to use it, but not Sztaki anymore.

                                                  I had some units which were due tomorrow but have aborted them. I've suspended all other projects and downloaded a couple of new WUs with a deadline a couple of days away, and will see if the estimated time goes up.

                                                  If it doesn't, I'll put Sztaki on hold until the deadline increases.

                                                  What would you suggest for a new deadline, 2 weeks or 3 weeks?
                                                  ____________


                                                  www.chris-kent.co.uk aka Chief.com

                                                  paul and kirsty yates
                                                  Joined: Jan 1 06
                                                  Posts: 11
                                                  Credit: 811
                                                  RAC: 0
                                                  Message 2009 - Posted 26 Feb 2006 17:47:46 UTC

                                                    7 hours in and 90% done, with all other projects on hold (7 at the moment). It only said 2 hours to completion when I started it this morning. As luck would have it, the rest of my projects don't need to be in until at least 3.3.06,
                                                    so I should be OK with it, but there will be no new work at the end until it sounds like it's been sorted.
                                                    Sorry, but I share the time fairly with the other projects, except uFluids, which gets Saturday to itself (but doesn't run during the week due to no checkpoints).

                                                    Hope it's sorted soon.
                                                    ____________

                                                    Nightbird
                                                    Forum moderator
                                                    Joined: Jul 12 05
                                                    Posts: 920
                                                    Credit: 114,924
                                                    RAC: 0
                                                    Message 2014 - Posted 26 Feb 2006 19:10:24 UTC

                                                      Last modified: 26 Feb 2006 20:00:09 UTC

                                                      My experience with my Barton 2500+:

                                                      ad28c637-cb25-4621-afc0-ee4cc4c4a4c1
                                                      27,415.00 sec (7h36min..)

                                                      71a17ad6-a56e-4b84-9c05-be13c8f6fb83
                                                      12,017.00 sec (3h20min..)

                                                      9d2c34cd-bf26-4be5-bb66-c2f3f48d70a9
                                                      11,454.00 sec (3h10min..)

                                                      3 WUs created Feb. 25th
                                                      -----------------
                                                      a41a726f-79b7-4681-b29b-ddc94a105ff5_1
                                                      46,540.00 sec (10h59 min..)

                                                      WU created Feb. 23rd
                                                      -----------------
                                                      Problems:
                                                      - I'm running Sztaki as a priority (all other projects suspended)
                                                      - I aborted 47 WUs (all created 23 Feb. -> deadline 27 Feb.)
                                                      - estimated time: ~1h20 - 1h35
                                                      - "little" cache: 1 day (usually 1.5 days)
                                                      - next deadlines: Simap (2 Mar.); Predictor (3 Mar.); Rosetta (1 Mar.)
                                                      ------------------
                                                      I'm waiting for the next application, which will work better with dimension 12.

                                                      ____________

                                                      Robert Nelson
                                                      Joined: Jul 12 05
                                                      Posts: 9
                                                      Credit: 419,683
                                                      RAC: 0
                                                      Message 2023 - Posted 26 Feb 2006 21:31:10 UTC - in response to Message 1971.

                                                        Last modified: 26 Feb 2006 21:33:12 UTC

                                                        Similar problem here. The system had adjusted to work units that ran in less than 30 minutes, in some cases as few as 10. So now I have received an absolute overload of work units which say they will run in less than 30 minutes but take 10 to 12 hours. There is no way that they can all be completed by the deadline.


                                                        Simple: just abort the ones that won't finish and leave the rest to BOINC; it will learn again that the WUs are long.

                                                        In 2 weeks or so an optimised version will be ready for the 12th dimension, so that should help.



                                                        Well, I just did the hateful thing and aborted about 100 work units. It is too bad that the venue problem got fixed before the new work units came out. If the bug had still been with us, the number of work units sent out when work was requested would not have been that great, giving the software time to adjust its estimate of work unit duration before the load hit, thus limiting the flood. With BOINC thinking that the work units would take 10 to 30 minutes depending on the machine, clients got the equivalent amount of work; unfortunately the real duration was 600 to 720 minutes, not 10 or 30. Those 100 work units would have been easily handled before. I did note that the estimated time, after the damage had been done, was adjusted by the client, so new work units were no longer showing 10 or 30 minutes but indeed 10 to 12 hours. Maybe in the future, when a new lot of work units is scheduled to come out, especially if the last batch had short durations, a temporary limit on work unit distribution until the clients have adjusted to the new work length may be in order.
                                                        ____________

                                                        Ananas
                                                        Joined: Jul 12 05
                                                        Posts: 222
                                                        Credit: 665,833
                                                        RAC: 0
                                                        Message 2026 - Posted 27 Feb 2006 2:54:46 UTC

                                                          Last modified: 27 Feb 2006 3:08:08 UTC

                                                          I had reset SZTAKI on one box as it had too many long running WUs - just to receive long running WUs again with a completely wrong estimated time. I had to detach that one, it makes no sense to keep crunching those WUs just for the trashbin :-(


                                                          P.s.: Waiting for the client to adjust the estimated time isn't a good idea. Older clients will never adjust, and if there's much variation in the time, it will never be OK.

                                                          This adjustment feature is surely not meant to repair broken project configurations; it's more likely a bugfix for the broken BOINC benchmark.

                                                          Bruno G. Olsen & ESEA @ greenholt
                                                          Joined: Jul 12 05
                                                          Posts: 16
                                                          Credit: 19,160
                                                          RAC: 0
                                                          Message 2039 - Posted 27 Feb 2006 14:36:38 UTC

                                                            Just aborted a bunch that were past deadline. I am curious whether a finished WU that was returned too late is still going to be used.


                                                            ____________

                                                            madmac
                                                            Joined: Sep 22 05
                                                            Posts: 27
                                                            Credit: 1,483
                                                            RAC: 0
                                                            Message 2048 - Posted 27 Feb 2006 17:19:29 UTC

                                                              The one I aborted: after 10 hrs, CPU time only showed 5 and it still had 60% to go. It would not make the deadline, so I aborted it.
                                                              ____________

                                                              John McLeod VII
                                                              Joined: Jul 12 05
                                                              Posts: 63
                                                              Credit: 140,300
                                                              RAC: 0
                                                              Message 2089 - Posted 2 Mar 2006 4:52:40 UTC - in response to Message 2026.

                                                                I had reset SZTAKI on one box as it had too many long running WUs - just to receive long running WUs again with a completely wrong estimated time. I had to detach that one, it makes no sense to keep crunching those WUs just for the trashbin :-(


                                                                P.s.: Waiting for the client to adjust the estimated time isn't a good idea. Older clients will never adjust, and if there's much variation in the time, it will never be OK.

                                                                This adjustment feature is surely not meant to repair broken project configurations; it's more likely a bugfix for the broken BOINC benchmark.

                                                                The Duration Correction Factor will catch either a badly estimated time or a BOINC benchmark that is off. However, the first result has to be processed in order to have any idea what is happening. As a safety measure, it works as well as it can: it moves aggressively upwards in the estimate and cautiously downwards. It does not work quite as well if there are only a very few very long-running results.

                                                                What is needed is for the project administrators to try a new application on a few guinea pigs first. This is not the first project with this problem, nor is it the worst case (that is so far held by a project that had estimates that were a factor of 900 too low and nine hundred was not a typo).
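
The "aggressively up, cautiously down" behaviour can be illustrated with a small sketch. This is not the BOINC client's actual code: the function name, the jump-straight-up rule and the 0.1 weight are assumptions chosen only to show the asymmetry.

```python
def update_dcf(dcf, raw_estimate_s, actual_s, down_weight=0.1):
    """Illustrative duration-correction update (not BOINC's real algorithm).

    dcf            current correction factor applied to raw estimates
    raw_estimate_s the project's uncorrected run-time estimate, in seconds
    actual_s       the run time actually measured, in seconds
    """
    ratio = actual_s / raw_estimate_s
    if ratio > dcf:
        # Result ran longer than the corrected estimate: jump up at once.
        return ratio
    # Result ran shorter: drift down by a small fraction of the gap.
    return dcf + down_weight * (ratio - dcf)


dcf = 1.0
dcf = update_dcf(dcf, 4500, 57600)  # 1.25 h estimated, 16 h actual
print(dcf)                          # factor after one long result
dcf = update_dcf(dcf, 4500, 4500)   # a later result that matches the raw estimate
print(dcf)                          # only slightly lower than before
```

One long result is enough to push the factor up to about 12.8, but it takes many short results to bring it back down, which is why a mix of very short and very long work units keeps the estimates pessimistic.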
                                                                ____________


                                                                BOINC WIKI

                                                                Ananas
                                                                Joined: Jul 12 05
                                                                Posts: 222
                                                                Credit: 665,833
                                                                RAC: 0
                                                                Message 2091 - Posted 2 Mar 2006 7:18:36 UTC

                                                                  Last modified: 2 Mar 2006 7:32:27 UTC

                                                                  Can this factor fix something if a lot of long running results return with 0 seconds reported CPU time? For many long running results the client has reset the CPU time just when they were done (CC4.19).


                                                                  P.S.: There is one box I have not detached; it still runs SZTAKI with a low priority, so I can see when the problems are gone. It's my guinea pig ;-)

                                                                  It would be good if the project preferences had a checkbox for "Accept test WUs" and the feeder/scheduler sent test WUs only to clients with this flag set. This would allow testing in the main project without much trouble.
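A minimal sketch of how such an opt-in could work, written in Python for illustration. The names `Host`, `accepts_test_wus` and `eligible_hosts` are hypothetical; no such preference exists in the project as far as this thread shows.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    accepts_test_wus: bool = False  # hypothetical "Accept test WUs" checkbox


def eligible_hosts(hosts, is_test_wu):
    """Return the hosts that may receive a work unit.

    Normal WUs can go to every host; test WUs go only to hosts that opted in.
    """
    if not is_test_wu:
        return list(hosts)
    return [h for h in hosts if h.accepts_test_wus]


hosts = [Host("guinea-pig", accepts_test_wus=True), Host("production-box")]
test_targets = [h.name for h in eligible_hosts(hosts, is_test_wu=True)]
normal_targets = [h.name for h in eligible_hosts(hosts, is_test_wu=False)]
```

The point of the design is that a badly estimated test application would reach only volunteers who asked for it.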

                                                                  madmac
                                                                  Joined: Sep 22 05
                                                                  Posts: 27
                                                                  Credit: 1,483
                                                                  RAC: 0
                                                                  Message 2093 - Posted 2 Mar 2006 12:02:53 UTC

                                                                    I have had to abort 3 because of the time factor. After 10 hrs of work I got 0 credit. Why? Because I had to switch the machine off and the CPU time went back to zero. I thought that this problem had been fixed, but it seems that it has reared its ugly head again.
                                                                    ____________

                                                                    Dimmerjas
                                                                    Joined: Feb 28 06
                                                                    Posts: 1
                                                                    Credit: 273
                                                                    RAC: 0
                                                                    Message 2115 - Posted 3 Mar 2006 20:18:20 UTC

                                                                      This project sucks!

                                                                      The WUs are too long and the time to finish them is way too short.

                                                                      If BOINC Manager is shut down, the CPU time used on a WU returns to 0 when the Manager is opened again, resulting in 0 credit when the WU is returned????

                                                                      And 0 credit is given if too many results are returned for the same WU, like 7 results. Even if one result is a "Computation" result and 2 results have passed their deadline, 4 "Correct" results have been returned. But no credit for these 4 anyway???? A lot of computation done by the 4 for no reason at all.

                                                                      That was my experience with this project. A lot of CPU time used, but no outcome. Due to something made by left-hand programmers.

                                                                      Adios Amigos!

                                                                      Profile Ananas
                                                                      Joined: Jul 12 05
                                                                      Posts: 222
                                                                      Credit: 665,833
                                                                      RAC: 0
                                                                      Message 2116 - Posted 3 Mar 2006 21:13:42 UTC - in response to Message 2115.

                                                                        Last modified: 3 Mar 2006 21:19:14 UTC

                                                                        ...
                                                                        And 0 credit is given if too many results are returned for the same WU, like 7 results. Even if one result is a "Computation" result and 2 results have passed their deadline, 4 "Correct" results have been returned. But no credit for these 4 anyway???? A lot of computation done by the 4 for no reason at all.
                                                                        ...



                                                                        With the max. # of total results set to 6, it shouldn't even deliver a 7th result. And if the required number of correct results is available, it should ignore all other problems caused by the deadline or the max. # of anything and just validate.
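The expected behaviour can be sketched in a few lines of Python. This is only an illustration of the rule stated above: the function names, the outcome strings and the quorum value of 3 are assumptions, and the comparison of results against each other that a real validator performs is left out.

```python
def may_send_another(results_sent, max_total_results):
    """A further copy of the WU may go out only while the cap is not reached."""
    return results_sent < max_total_results


def should_validate(outcomes, min_quorum):
    """Validate as soon as enough successful results exist.

    Errors and missed deadlines among the other results are ignored.
    """
    successes = sum(1 for outcome in outcomes if outcome == "success")
    return successes >= min_quorum


# The case from the quoted post: 4 correct results, 1 computation problem,
# 2 results past their deadline.
outcomes = ["success"] * 4 + ["error"] + ["missed_deadline"] * 2
seventh_allowed = may_send_another(results_sent=6, max_total_results=6)
validates = should_validate(outcomes, min_quorum=3)  # quorum of 3 is assumed
```

Under these assumptions the 7th result would never be sent, and the 4 correct results would be enough to validate and grant credit.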

                                                                        But this is not a SZTAKI bug; it's a bug in the feeder and/or in the assimilator, which are basic BOINC components. I have seen it in several projects.



                                                                        Copyright © 2017 SZTAKI Desktop Grid