How to – build CUDA-based aionminer on Ubuntu 16.04

I would think this would be fine, but I cannot say for sure.

First, do you have cards from mixed vendors in your machine (some nvidia, some amd)?
Are you running OpenGL for any reason?

If the answer to both of those is “no”, then I don’t think that library will have any effect, since its sole purpose is to arbitrate OpenGL calls in mixed vendor environments. So yeah, overwrite it. It may be part of the required call chain, but it shouldn’t break anything to replace it if you are only dealing with nvidia hardware.

got it – thanks!

Just 2 nvidia 1080ti’s. Overwrote it… The Nouveau thing remains an issue. After the install via console, when I start lightdm and then reboot, the login screen:

  1. isn’t formatted to the screen resolution
  2. will post back to the login screen upon password entry (every single time).

If I shoot back into console and run a sudo command to remove all nvidia files, I’m able to log back in no problem (and resolution goes back to normal).

Really sounds like I have 2 options:

  1. Somehow get ahold of root access so that I can make nouveau at least irrelevant (not even wiping it from system).
  2. Use the binary option – something you said above. Can you please share insights on this process? Or provide a link for details?

Cannot say it enough – Thanks a TON for these insights. First time building a computer, let alone mining. Been a great process, and with this last bit figured out I’ll be 100% good to go with this rig on the Aion test net (mining with the internal miner for the time being).

So, as long as you are running commands with sudo then you are running as root. I just wanted that to be clear, because without root privilege I don’t think you can do any of this stuff.

So, the login loop issue - I think one of the other forum members found a decent resource for resolving that. It’s actually in this thread (had to look), right up here. The only way I ever got past that issue was to remove all nvidia and cuda related stuff and start again.

Regarding your questions:

  1. If you can use sudo, then you have root access. Just preface commands with sudo if you want them to run as root (this is common for most/all linux distros, but this is actually the only way to do it in ubuntu without manually adding a root account.)
  2. The binary option is what you are trying to do - install the binary nvidia driver. Since the driver itself is not open-sourced, it is only distributed as a binary file. So, it is sometimes referred to as a binary driver since you cannot build it from source yourself.

So, that being said, I think you are on the right track. You just hit a snag with that login-loop issue, which is a weird one I don’t understand. It definitely has something to do with driver conflicts. So, yeah… this post has everything except removal of nouveau.

The official documentation says to create the blacklist file (the driver installer will do this for you if nouveau is loaded when you run it), update initramfs, reboot, and then you should be able to install the nvidia driver (again, following the instructions above). If you have issues and still need to manually remove nouveau, then try just sudo rmmod nouveau before trying the nvidia installer again.
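
A minimal sketch of the blacklist step described above. It assumes the file name /etc/modprobe.d/blacklist-nouveau.conf and the two directives that NVIDIA's Linux installation guide lists for Ubuntu. The helper name write_nouveau_blacklist and its optional path argument are made up for illustration, so that the file can be written somewhere harmless and inspected before it is copied into place.

```shell
# Hypothetical helper: writes the nouveau blacklist entries to the given file.
# The default path is an assumption based on NVIDIA's Ubuntu instructions.
write_nouveau_blacklist() {
    target="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}

# Possible use on the rig (the copy and the initramfs update need root):
#   write_nouveau_blacklist /tmp/blacklist-nouveau.conf
#   sudo cp /tmp/blacklist-nouveau.conf /etc/modprobe.d/
#   sudo update-initramfs -u
#   sudo reboot
```

Writing to a scratch path first is only a convenience here; if the nvidia installer already created the file for you, this step can be skipped.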

Also, be sure to let the nvidia installer create your xorg.conf file at the end. That is needed for X to restart and properly use the driver.

"So, as long as you are running commands with sudo then you are running as root. I just wanted that to be clear, because without root privilege I don’t think you can do any of this stuff"
^So I’ve used sudo a fair bit, but I keep hitting roadblocks when attacking the whole nouveau thing (e.g. even with sudo, none of the steps here ever worked: https://askubuntu.com/questions/112302/how-do-i-disable-the-nouveau-kernel-driver). Perhaps something else is keeping me from making the changes considered in that thread? Not really relevant at this point I guess - just curious.

Great to hear that the login loop is more common than just me! Everything sounds right on track. First thing I’ll do is attack the login loop issue, and if all else fails, I’ll probably wipe it, reinstall ubuntu and attack the process as follows:

  1. Get the kernel up and going (syncing with the blockchain)
  2. Meanwhile, download binaries for driver. Get set up before the blockchain is synced. Being cognizant of the login-loop issue + using that resource to manage it.
  • If all works out with the driver install, proceed as usual
  • If not, go after the whole nouveau thing (hoping not necessary with binary driver route).
  3. Go through your steps at the beginning of this thread, praying I get the 1080ti’s listed on the aion reference miner command!
  4. (question since this will be uncharted territory for me) – At this point, will the GPUs be getting work directed to them from the pool if all checks out? Before doing this, should I set the aionminer config.xml mining tag to “false”? (this was a step in the solo-mining-pool github run through).

Won’t be back to my rig until Thursday night, but I’m counting the hours… after all of this great help I want to get back to it ASAP!

I can see that. If the nouveau driver wasn’t already blacklisted, then you cannot rmmod it, even with sudo. You have to blacklist it first, reboot, then you can rmmod it (if needed) to get the nvidia driver installed.
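
A small sketch for checking whether nouveau is still loaded before trying rmmod or the nvidia installer. It reads /proc/modules, where the kernel lists one loaded module per line with the module name in the first column. The helper name module_loaded and its optional file argument are hypothetical.

```shell
# Hypothetical helper: succeeds if the named module appears in a modules list.
# The second argument defaults to /proc/modules (module name is column one).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Possible check after blacklisting and rebooting:
#   if module_loaded nouveau; then
#       echo "nouveau is still loaded, try: sudo rmmod nouveau"
#   fi
```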

Ahhhh okay got it. Thank you again for all of this-- if it’s cool by you, I’ll follow up here on Wednesday or Thursday when I’m back working on my rig!

Everything worked out great! Thank you again.

So now that I have the aion kernel and the aionminer all set up, how do I connect them? Do I set “mining” to “false” in the aion config.xml file, run ./aion.sh, and then run ./run.sh from the aion_solo_pool folder?

That’s correct.
Disable mining in the config.xml file, start the kernel (./aion.sh), then start the pool with ./run_quickstart.sh, and finally run ./aionminer with the proper parameters for your setup.

So while in the /aion_solo_pool folder I ran ./run_quickstart.sh and it told me I had no ./redis/src/redis-server file. Should I run ./configure.sh? Was I in the wrong folder?

Also, can you give me an example of the parameters necessary for the ./aionminer command? I’m running two 1080ti’s.

Would love to send you a dozen or so Aion to say thank you for all this help-- pls dm me an Eth address! (thinking erc 20 aion tokens!).

While in aion_solo_pool folder, first run ./configure.sh, then ./run_quickstart.sh. The pool should now start up and wait for incoming connections. Then go to aion_miner folder and run ./aionminer -cd 0 -cv 1 -cb 64 -ct 64. I’m not exactly sure what the best parameters are, or if you can run with 2 cards at the moment.

Also, I’m pretty sure everybody here is just here to help out, so no need to send aion tokens :) (can’t speak for everybody of course)

So for two cards, change -cd 0 to -cd 0 1. With my 1080 ti, I’ve found that setting the blocks per device (-cb) to 140 really helped.

./aionminer -cd 0 1 -cv 1 -cb 140 -ct 64 -u {0xacc}
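
As a rough sketch, the command above can be assembled for any number of cards with a small helper. The flags (-cd, -cv, -cb, -ct, -u) and the values 1 and 64 are the ones quoted in this thread; the helper name aionminer_cmd and the placeholder address are made up for illustration, and the helper only prints the command so it can be checked before running.

```shell
# Hypothetical helper: prints an aionminer command line for the given address,
# blocks-per-device value, and list of CUDA device indices.
aionminer_cmd() {
    addr="$1"
    blocks="$2"
    shift 2
    # "$*" joins the remaining arguments (the device indices) with spaces.
    echo "./aionminer -cd $* -cv 1 -cb $blocks -ct 64 -u $addr"
}

# Example for two cards with 140 blocks each (0xYOUR_ADDRESS is a placeholder):
#   aionminer_cmd 0xYOUR_ADDRESS 140 0 1
```

Whether 140 is the best value for other cards is not something this thread establishes, so treat it as a starting point.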

Thank you guys so much-- very helpful stuff. That was my next best guess but after all this prep I didn’t want to potentially compromise what I’d thrown together so far.

One final hangup-- within the aion solo pool (while it’s running), I get the following message every few seconds:

2018-03-15 13:18:11 [Pool] [aion] (Thread 1) rpc error with daemon instance 0 when submitting block with submitblock {"type":"request error","message":"socket hang up"}

I’m assuming something’s off in the poolconfig.xml file? Perhaps my port info? Many thanks in advance to anyone who can help me remedy this!

I haven’t experienced that error, so I cannot say for sure.

Just for confirmation, you have the kernel and solo_pool both running on the same machine? And you are using all the default ports? I ask because RPC is how the pool talks to the kernel.

Also, as a data point, I changed nothing about my pool config. I ran ./configure.sh and ./run_quickstart.sh after starting the kernel and the pool just worked.

Thanks for these insights… Yeah, kernel and solo pool both are running on the same machine. Did the same as you in terms of just getting everything up without changing config files (apart from turning mining to false on the kernel).

I’m able to see visually within both the kernel config file and the pool config file that the host and port perfectly match.

The closest thread I can find to this issue is this, but it doesn’t have too much to offer in this case: https://github.com/zone117x/node-open-mining-portal/issues/499

Will keep digging- thanks again!

So a little update: I’ve tried to chop things up to best understand what any issue is.

Kernel-- all synced up, mined a block with cpu a few min ago before turning off internal mining. Opened it up and had it just syncing with the blockchain.

Aion_solo_pool: ran configure.sh and run_quickstart.sh, and it was calmly waiting for a request for work from the GPUs

Aionminer: Ran the command, everything looked good to go. After a few instances of block notifications via RPC polling, I still get a red error just like the one above.

I’ve NOT changed the ip address or anything from the out-of-the-box setup. Perhaps that’s a factor?

TL;DR – made sure it wasn’t the kernel, made sure it wasn’t the solo pool, but I can’t identify what the issue is within the miner itself despite seeing that there’s a disconnect causing the error.

Might be a bug though. If you think it is, you can post an issue on the aion_miner github and then a developer can have a look at it.

Thanks Jim-- going through process again this AM. If it is, I’ll post it there for sure.

Any solutions for this?

@Rachdingue

Please see the next couple posts…

This was the solution that seemed to work for me… I haven’t quite gotten enough time to set up the rest of my rig but I think the steps I took were in the correct direction.