Host with 1,348 WUs?!?



Message boards : SZTAKI Desktop Grid : Host with 1,348 WUs?!?

miketoth1001
Joined: Apr 3 06
Posts: 20
Credit: 27,967
RAC: 0
Message 5366 - Posted 1 Jan 2007 12:04:02 UTC

    You might want to look at the below host:

    http://szdg.lpds.sztaki.hu/szdg/show_host_detail.php?hostid=10159

    He has 1,348 WUs and has not returned a single one. These go back to 12 Dec 2006 16:21:54 UTC. This might be where all the problems are stemming from, with WUs either aborted by the user or missing the deadline.

    I can't EVER see even a very fast machine crunching all of these WUs by the deadline, and he's running this on a GenuineIntel Pentium II (Deschutes) with 1 CPU and 566.33 MB of memory, so I would consider these WUs basically dead, even though the first one won't hit its deadline until 9 Jan 2007 16:21:54 UTC. That means 8 more days until the system sees that the WUs need to be re-issued, and then they have to queue up. To me, this number of WUs looks suspicious.

    You might want to talk to the Admin at RenderFarm@Home. He installed a way for users to get WUs, but not too many at a time. Might help out the problems here. For me, I get 4 WUs at a time. Spreads out the WUs among the users, and ensures that no one should be able to do this type of thing.

    Hope this helps.
    ____________

    Nightbird
    Forum moderator
    Joined: Jul 12 05
    Posts: 920
    Credit: 114,924
    RAC: 0
    Message 5367 - Posted 1 Jan 2007 12:34:40 UTC - in response to Message 5366.

      Last modified: 1 Jan 2007 12:39:17 UTC

      You might want to look at the below host:

      http://szdg.lpds.sztaki.hu/szdg/show_host_detail.php?hostid=10159

      He has 1,348 WUs and has not returned a single one. These go back to 12 Dec 2006 16:21:54 UTC. This might be where all the problems are stemming from, with WUs either aborted by the user or missing the deadline.

      I can't EVER see even a very fast machine crunching all of these WUs by the deadline, and he's running this on a GenuineIntel Pentium II (Deschutes) with 1 CPU and 566.33 MB of memory, so I would consider these WUs basically dead, even though the first one won't hit its deadline until 9 Jan 2007 16:21:54 UTC. That means 8 more days until the system sees that the WUs need to be re-issued, and then they have to queue up. To me, this number of WUs looks suspicious.

      You might want to talk to the Admin at RenderFarm@Home. He installed a way for users to get WUs, but not too many at a time. Might help out the problems here. For me, I get 4 WUs at a time. Spreads out the WUs among the users, and ensures that no one should be able to do this type of thing.

      Hope this helps.

      This is not the best or ideal solution (I guess), since SZTAKI always has WUs to send.
      On RenderFarm@Home, he did it because the tweaked scheduler "avoids some people getting a hundred units while others get nothing" (I quote him).
      Possible problems here: a large cache size, perhaps long-term debt, and perhaps this user doesn't look at his BOINC Manager and/or doesn't care.
      I asked Adam to decrease the daily WU quota (presently 100/day/CPU) and, frankly, I think that sooner or later the project will need to shorten the deadline (to get WUs re-sent faster).

      edit :
      same problem here
      ____________

      miketoth1001
      Joined: Apr 3 06
      Posts: 20
      Credit: 27,967
      RAC: 0
      Message 5386 - Posted 2 Jan 2007 21:31:43 UTC

        Even if the user(s) had a large cache size, the maximum is 10 days. As this has been going on since the 12th of December, it should have uploaded at least 1 result. The same goes for the setting about connecting to the network every ## days: it's 10 max. It's connecting every day and downloading new WUs. I won't say it, but I think we all know what's going on. Maybe it's an irate ex-user. Who knows.

        Decreasing the number of WUs issued may help a bit, but there will still be dead WUs issued. Maybe ban the IP until it's straightened out. This is happening with several users.

        Nicolas
        Joined: Dec 10 06
        Posts: 1
        Credit: 1,335
        RAC: 0
        Message 5387 - Posted 2 Jan 2007 21:41:43 UTC - in response to Message 5367.

          On RenderFarm@Home, he did it because the tweaked scheduler "avoids some people getting a hundred units while others get nothing" (I quote him).

          On RenderFarm I used 2 or 3 workunits per host because of my needs: I have no more than a thousand units, and I need them evenly distributed so as to get them back as fast as possible.

          But maybe a 500-workunit limit would help here. The daily quota helps, but as the name says, it's daily. A user could get a hundred units per day for 10 days. What I mean is a "WU per host" limit: you can't get more until you return the ones you have (or they reach the deadline and are canceled on the server).
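
The difference between a daily quota and a per-host limit can be sketched in a few lines of Python. This is only an illustration, not the RenderFarm@Home or BOINC scheduler code: the `HostState` class, the `wus_to_send` function and their parameters are made-up names, and the defaults of 100 per day and 500 per host are simply the figures mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Hypothetical per-host bookkeeping kept on the server side."""
    in_progress: int  # WUs sent to the host and not yet returned or expired
    sent_today: int   # WUs already sent to the host today


def wus_to_send(host: HostState, requested: int,
                daily_quota: int = 100, per_host_limit: int = 500) -> int:
    """Return how many WUs this illustrative scheduler would hand out."""
    # Room left under the daily quota; this resets every day.
    quota_room = max(0, daily_quota - host.sent_today)
    # Room left under the per-host limit; this only frees up when
    # results come back or reach their deadline.
    limit_room = max(0, per_host_limit - host.in_progress)
    return max(0, min(requested, quota_room, limit_room))
```

With these assumed numbers, a host sitting on 1,348 unreturned WUs would get nothing more, however many days pass, while a host with an empty queue would still get what it asks for up to the daily quota.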
          ____________

          Richard Maths 1
          Joined: Aug 16 05
          Posts: 43
          Credit: 9,526
          RAC: 0
          Message 5388 - Posted 4 Jan 2007 4:34:12 UTC

            This computer

            http://szdg.lpds.sztaki.hu/szdg/show_host_detail.php?hostid=10159

            is now up to 1542 WUs
            and is picking up 10 WUs each day it is turned on.

            And it has 0.00 credits total. Possibly none of the WUs were worked on and they just overran the deadline.

            So for all the others that are waiting for their pending credits on these 1542 WUs, it will be a long time coming. :-(

            It looks as if the person has not checked this particular computer for its production.

            Is there any way for the project to contact this person?

            If there is a way to stop that computer from getting any more WUs and possibly get those WUs back into circulation, it may help the project.

            I wonder if there are other computers doing the same. No wonder the pending credits are being held up.

            Thanks,
            Richard

            ____________

            Richard Maths 1
            Joined: Aug 16 05
            Posts: 43
            Credit: 9,526
            RAC: 0
            Message 5395 - Posted 5 Jan 2007 2:56:58 UTC

              Last modified: 5 Jan 2007 2:58:48 UTC

              I was wrong about this one:

              http://szdg.lpds.sztaki.hu/szdg/show_host_detail.php?hostid=10159

              It is picking up 50 or so WUs a day, not 10 as I said before.

              Right now it is at 1597, an increase of 55 since my last note.

              This person has 10 computers.

              This computer is the only one active for SZTAKI, and it is contacting SZTAKI for more WUs every day without getting credit for any that it may be processing (it is my opinion that it is not processing any at all, but I have not had time to dig down through the 1500 WUs listed to find out).

              Adam or someone needs to look at this, since this one computer is holding up the credits for many others.
              Somebody could be processing these 1597 WUs for the project.

              Thanks,
              Richard
              ____________

              Richard Maths 1
              Joined: Aug 16 05
              Posts: 43
              Credit: 9,526
              RAC: 0
              Message 5398 - Posted 6 Jan 2007 5:54:54 UTC - in response to Message 5366.

                Last modified: 6 Jan 2007 5:57:44 UTC

                You might want to look at the below host:

                http://szdg.lpds.sztaki.hu/szdg/show_host_detail.php?hostid=10159

                He has 1,348 WUs and has not returned a single one. These go back to 12 Dec 2006 16:21:54 UTC. This might be where all the problems are stemming from, with WUs either aborted by the user or missing the deadline.

                My comment:
                __________________________________________________________________
                -- Now at 1663 WUs and going up every day.
                ___________________________________________________________________

                I can't EVER see even a very fast machine crunching all of these WUs by the deadline, and he's running this on a GenuineIntel Pentium II (Deschutes) with 1 CPU and 566.33 MB of memory, so I would consider these WUs basically dead, even though the first one won't hit its deadline until 9 Jan 2007 16:21:54 UTC. That means 8 more days until the system sees that the WUs need to be re-issued, and then they have to queue up. To me, this number of WUs looks suspicious.

                My next comment:
                _________________________________________________________________

                The deadline of the last group is now Feb. 3. And every one of them will miss the deadline: none are being processed!

                Anyone having the same WUs will have to wait a long time before any of these get reissued so that they may get credit.
                And just think of the "science" being lost to this time delay.

                At a rate of at least 50 WUs added each day to the above number of 1663, the total will be about 3300 WUs, with the last ones having a deadline of about Feb. 28.
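
The projection above can be checked with a rough calculation. The sketch below is only an estimate under stated assumptions: a 28-day deadline window (inferred from the first WU, issued 12 Dec 2006 and due 9 Jan 2007) and a constant fetch rate. The function and constant names are made up for illustration.

```python
from datetime import date, timedelta

# Assumption: every WU gets the same 28-day window as the first one
# in this thread (issued 12 Dec 2006, deadline 9 Jan 2007).
DEADLINE_WINDOW = timedelta(days=28)


def projected_backlog(current: int, per_day: int,
                      today: date, last_deadline: date) -> int:
    """Estimate the backlog once WUs due on last_deadline have been issued."""
    # WUs due on last_deadline were issued one deadline window earlier.
    last_fetch = last_deadline - DEADLINE_WINDOW
    days = max(0, (last_fetch - today).days)
    return current + per_day * days


today = date(2007, 1, 6)
print(projected_backlog(1663, 50, today, date(2007, 2, 28)))  # 2913
print(projected_backlog(1663, 66, today, date(2007, 2, 28)))  # 3313
```

At a flat 50 a day the estimate is 2913. At 66 a day, which is the jump from 1597 to 1663 over roughly the last day, it comes to 3313, close to the 3300 figure.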

                And just think how long it will be before these are reissued and processed...
                Then there is always the possibility that this same computer will pick them up again.

                Help!! Any comments? Anyone that can do something about this out there?

                Thanks,
                Richard
                _________________________________________________________________
                End of my comments.

                You might want to talk to the Admin at RenderFarm@Home. He installed a way for users to get WUs, but not too many at a time. Might help out the problems here. For me, I get 4 WUs at a time. Spreads out the WUs among the users, and ensures that no one should be able to do this type of thing.

                Hope this helps.


                ____________

                miketoth1001
                Joined: Apr 3 06
                Posts: 20
                Credit: 27,967
                RAC: 0
                Message 5402 - Posted 7 Jan 2007 1:18:35 UTC

                  Yeah. Any computer found with that many unprocessed WUs should have its IP banned and its account deleted. I have no idea what this person is doing, or why they're doing it, but it has to stop. I do understand that sometimes a configuration error or a file system error could hose some WUs, but they would be sent back with an error message.

                  I've seen several hosts throughout my pending-credit WUs that have from 300 to this ungodly number of WUs just dying out there. To me, it looks like they download the WUs and then reset the project. If they aborted the WUs, then there should be SOME message back from even one of them to that end.

                  But, as I said before, the custom scheduler from RenderFarm would stop this, as it's server side and not client side. They simply would not get any more WUs. It would also stop people from being slammed with dozens of WUs. They would get, at most, 2 WUs per CPU.

                  Right now, the only way I see to really get this problem under control is to stop making new WUs and concentrate on getting what's out there finished. Once it's caught up, precautions could be put in place. Also, a deadline of 1 to 2 weeks would be good.

                  Right now, this project is getting a bad rep on other forums. It's sad.
                  ____________

                  miketoth1001
                  Joined: Apr 3 06
                  Posts: 20
                  Credit: 27,967
                  RAC: 0
                  Message 5407 - Posted 7 Jan 2007 15:02:18 UTC

                    That host is now, as of the time of this post, up to 1750 WUs. And still not a credit to its name.

                    Anyway. I noticed that I forgot one critical part in my recommendations: not only a shorter deadline, but also changing the initial replication to either 4 or 5. This way, if 2 hosts have errors or just kill the WUs, the other 3 should finish the WU and return the results. Check this WU to see what I mean. It's been sent out 11 times, and the weird thing is that 2 P4s actually crunched for a longer period of time than my P3 Coppermine.

                    Another one is this one. It's been replicated 9 times, with another WU ready to go out. Problem is, it doesn't look like it's gonna get picked up real soon.

                    Anyway. I have to blow away the C drive due to some file system errors that have been getting worse and worse, so until I get XP on this machine back up and fully patched, no more crunching for me on this box. Hopefully I'll be back up by 4 PM my time. BOINC is on a separate drive, anyway.

                    robert.mouris
                    Joined: Nov 3 05
                    Posts: 129
                    Credit: 4,124,189
                    RAC: 0
                    Message 5409 - Posted 7 Jan 2007 16:44:48 UTC - in response to Message 5407.

                      Last modified: 7 Jan 2007 16:48:19 UTC

                      change the Initial replication to either 4 or 5

                      I would not like a higher initial replication number. It is true that in the short run some WUs would reach their quorum earlier (most WUs are without problems and reach their quorum right from the start, especially the new ones), but in the long run it would be a waste of resources. We would waste 25 or 40% of our CPU cycles.

                      In my opinion, high initial replication numbers are justified for projects where results are needed quickly for the next phase of the project (LHC@home, Chess960@home...), but here we are not in a rush. The mathematicians of Eötvös Loránd University probably cannot start their analysis of our results for a given dimension before we have sent in all the results. And if we spend 25 or 40% of our time crunching for the garbage bin, scientific progress will be delayed. In my view, that is too high a price to pay just to move results faster from the "initial" to the "valid" state in the validation process.
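
The 25 and 40% figures follow from simple arithmetic if the quorum is 3. That quorum is an assumption here: it matches "the other 3 should finish the WU" in the earlier post, but the thread does not state the project's actual setting. A minimal Python sketch, with a made-up function name:

```python
def wasted_fraction(initial_replication: int, quorum: int = 3) -> float:
    """Share of issued copies that are surplus when every copy returns valid.

    Assumes the best case for waste: no host errors out or misses the
    deadline, so every copy beyond the quorum was not needed.
    """
    if quorum < 1 or initial_replication < quorum:
        raise ValueError("initial_replication must be >= quorum >= 1")
    return (initial_replication - quorum) / initial_replication


for replication in (3, 4, 5):
    print(replication, f"{wasted_fraction(replication):.0%}")
```

Under that assumption, a replication of 4 means 1 copy in 4 is surplus (25%), and a replication of 5 means 2 in 5 (40%). For WUs where some hosts never reply, the surplus copies are not wasted, which is the trade-off being argued in this thread.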
                      ____________

                      miketoth1001
                      Joined: Apr 3 06
                      Posts: 20
                      Credit: 27,967
                      RAC: 0
                      Message 5509 - Posted 21 Jan 2007 3:51:22 UTC

                        OK. I've been waiting for an update, but it hasn't come. So, I'll give mine.

                        The host in question is running Linux. The 686 flavor of Linux. It looks like this guy has several machines, all from the same image. I've found the hosts on Seti, and I'm sure he's elsewhere. Thing is, except for one machine with 0.04 credits, none of his 686 installs on a Deschutes has ANY credits. Why? Probably because he should be running the 386, plain vanilla version. If I recall correctly, the 686 is for P3 and above. I have found a few machines on here that are running the 386 code on P2s, with no problems.

                        This guy is still picking up WUs from here. He's up to 1853 WUs, and still counting. Fortunately, he's only able to get 1 to 2 WUs a day.

                        This WU is my oldest still pending. It's from September.

                        Fritz
                        Joined: Apr 3 06
                        Posts: 8
                        Credit: 38,256
                        RAC: 0
                        Message 5511 - Posted 21 Jan 2007 11:41:24 UTC

                          I've got at least one of the affected WUs on my list ... his is due to age out in a couple of hours.

                          Down to 1108 WUs waiting to age out. Currently gaining an average of 1 a day so that number will fall quickly to around 28 as the January deadline WUs age.

                          Average Turnaround time 28 days
                          Total Credit 0
                          Recent Average Credit 0

                          700+ listed now as No Reply
                          0 listed as completed with success or error
                          ____________

                          Fritz
                          Joined: Apr 3 06
                          Posts: 8
                          Credit: 38,256
                          RAC: 0
                          Message 5512 - Posted 21 Jan 2007 11:46:11 UTC - in response to Message 5509.

                            OK. I've been waiting for an update, but it hasn't come. So, I'll give mine.

                            The host in question is running Linux. The 686 flavor of Linux. It looks like this guy has several machines, all from the same image. I've found the hosts on Seti, and I'm sure he's elsewhere. Thing is, except for one machine with 0.04 credits, none of his 686 installs on a Deschutes has ANY credits. Why? Probably because he should be running the 386, plain vanilla version. If I recall correctly, the 686 is for P3 and above. I have found a few machines on here that are running the 386 code on P2s, with no problems.

                            This guy is still picking up WUs from here. He's up to 1853 WUs, and still counting. Fortunately, he's only able to get 1 to 2 WUs a day.

                            This WU is my oldest still pending. It's from September.


                            It's ancient history now, but under the old naming system the Pentium was the 586 and the Pentium II was the 686. So it's possible that he just installed what he thinks is a match for his CPU and has never bothered to look at the results. Perhaps a few emails from the admins at the various projects he's signed up for would get the correct version of BOINC installed.
                            ____________

                            Gaurav
                            Joined: Sep 4 06
                            Posts: 22
                            Credit: 52,275
                            RAC: 0
                            Message 5649 - Posted 5 Feb 2007 4:00:20 UTC

                              Last modified: 5 Feb 2007 4:00:46 UTC

                              There is another host with 1042 WUs, and that host and my host have at least 15 WUs in common.
                              Now I don't see him returning those WUs in a month. His last successful result was submitted on Dec 30th 2006. Hosts should not be allowed to download more than 15 WUs at a time.

                              note: during the time it took me to write this post, this host's WUs went up to 1072

                              ____________
