If you haven’t heard by now, Joyent has recently open sourced SmartDataCenter and the Manta object storage platform. Most people probably don’t realize it yet, but this is a really big deal. Mad props to everyone at Joyent involved in this massive effort!

Disclaimer: The method described below is NOT supported for use with Joyent SDC.

Since the announcement, there has been a steady stream of individuals dropping by the #smartos IRC channel on Freenode asking whether it’s possible to convert their existing SmartOS installs to SDC. The official and definitive answer to this question is an emphatic no, as spelled out in the SmartDataCenter FAQ. I cannot stress this enough, and if there is any doubt about it, please stop by the IRC channel, where you’ll be promptly pointed to this FAQ.

The purpose of this post is to evaluate what’s possible from a purely educational and technical perspective. Note that the FAQ says “Migration of instances from SmartOS to SDC compute nodes and between SDC compute nodes may be possible using vmadm(1m) and ZFS commands but is not a supported product feature and is not recommended.” This means you’re welcome to try, but you get to hold onto the pieces when, not if, it breaks. If you’re thinking about doing this with anything you or your business considers production or important, then stop reading now. You’ve been sufficiently warned and are proceeding at your own risk.

I have run SmartOS on an all-in-one media server at home for a number of years, and I wrote a piece titled Install SmartOS and keep your existing pool that explained how I migrated from a Nexenta-based system to SmartOS without losing my existing data. I currently have a dedicated SDC installation in my basement on two Dell C1100s for testing purposes, and I’m interested in either migrating my existing SmartOS instances over to it or, preferably, converting the SmartOS install to SDC while retaining the existing data on the pool. Below I show how I recently migrated an instance from my SmartOS system to my SDC install. Please note that I have modified the hostnames used in the prompts to make it easier to identify which system I was on at the time.

[root@smartos ~]# vmadm list | grep mysqldb
55d81a4a-bac4-46b5-aee6-6894f103805c  OS    2048     running           mysqldb
[root@smartos ~]# vmadm stop 55d81a4a-bac4-46b5-aee6-6894f103805c
Successfully completed stop for VM 55d81a4a-bac4-46b5-aee6-6894f103805c
[root@smartos ~]# vmadm send 55d81a4a-bac4-46b5-aee6-6894f103805c | ssh 192.168.5.15 vmadm receive
Invalid value(s) for: image_uuid
Successfully sent VM 55d81a4a-bac4-46b5-aee6-6894f103805c
[root@sdc-cn1 (homelab) ~]# zfs list -rtall | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
[root@sdc-cn1 (homelab) ~]#
[root@sdc-cn1 (homelab) ~]# vmadm list | grep mysqldb
[root@sdc-cn1 (homelab) ~]#

So I found the UUID of the instance I wanted to migrate, stopped it, and attempted to use the experimental/undocumented vmadm send/receive. It appeared to work based on the “Successfully sent” message, but the “Invalid value(s) for: image_uuid” line is key, and we can see that the zone and its dataset were not actually received on the SDC compute node. What this error means is that the image used by the source dataset is missing on the destination. We can find this information with the following command:

[root@smartos ~]# imgadm list | grep `vmadm get 55d81a4a-bac4-46b5-aee6-6894f103805c | json image_uuid`
dc0688b2-c677-11e3-90ac-13373101c543  base64     13.4.2     smartos  2014-04-17T21:33:04Z
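
Incidentally, you can spare yourself the failed first attempt by checking the destination ahead of time. What follows is only a rough pre-flight sketch, reusing the instance UUID and destination address from above; it asks the destination whether the source instance’s image is already installed:

# Sketch only: pre-flight check before attempting vmadm send/receive.
IMG=$(vmadm get 55d81a4a-bac4-46b5-aee6-6894f103805c | json image_uuid)
if ssh 192.168.5.15 imgadm get "$IMG" >/dev/null 2>&1; then
  echo "image $IMG already present on the destination"
else
  echo "image $IMG missing on the destination; import it before sending"
fi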

Now we need to import this on the SDC compute node:

[root@headnode (homelab) ~]# sdc-imgadm import dc0688b2-c677-11e3-90ac-13373101c543 -S https://images.joyent.com
Imported image dc0688b2-c677-11e3-90ac-13373101c543 (base64, 13.4.2, state=active)
[root@sdc-cn1 (homelab) ~]# imgadm import dc0688b2-c677-11e3-90ac-13373101c543
Importing image dc0688b2-c677-11e3-90ac-13373101c543 (base64@13.4.2) from "http://imgapi.homelab.zpool.org"
dc0688b2-c677-11e3-90ac-13373101c543           [======================================>] 100% 109.61MB  19.97MB/s     5s
Imported image dc0688b2-c677-11e3-90ac-13373101c543 (base64@13.4.2) to "zones/dc0688b2-c677-11e3-90ac-13373101c543"
[root@sdc-cn1 (homelab) ~]# imgadm list | grep dc0688b2-c677-11e3-90ac-13373101c543
dc0688b2-c677-11e3-90ac-13373101c543  base64             13.4.2                                      smartos  2014-04-17T21:33:04Z

The first command imports the image from Joyent’s official image repository and caches it on the headnode. The second command imports the image onto the compute node from the headnode’s IMGAPI, and the third command verifies it’s there. With the image in place on the compute node, it’s time to retry the migration:

[root@smartos ~]# vmadm send 55d81a4a-bac4-46b5-aee6-6894f103805c | ssh 192.168.5.15 vmadm receive
Password:
Successfully sent VM 55d81a4a-bac4-46b5-aee6-6894f103805c
Successfully received VM 55d81a4a-bac4-46b5-aee6-6894f103805c
[root@sdc-cn1 (homelab) ~]# vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  OS    2048     stopped           mysqldb

So far so good! Now we need to confirm that VMAPI is aware of it and that we can properly start, stop, and log into it. Best practice dictates that you not call the APIs directly, but for illustration I’m showing one example of querying VMAPI directly, followed by the appropriate sdc-vmadm wrapper to manage the instance.

[root@headnode (homelab) ~]# sdc-vmapi /vms/55d81a4a-bac4-46b5-aee6-6894f103805c | json -Ha zonename uuid zone_state state
 55d81a4a-bac4-46b5-aee6-6894f103805c installed stopped
[root@headnode (homelab) ~]# sdc-vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  joyent          2048  stopped  mysqldb
[root@headnode (homelab) ~]# sdc-vmadm start 55d81a4a-bac4-46b5-aee6-6894f103805c
Start job b5c1b6a0-d097-4571-a6d4-4b5ebfa86764 for VM 55d81a4a-bac4-46b5-aee6-6894f103805c created

What this did was submit a job to the workflow API, which coordinates with the other APIs to find the right compute node and issue the commands to start the instance. After a brief moment, we check again:

[root@headnode (homelab) ~]# sdc-vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  joyent          2048  stopped  mysqldb
[root@sdc-cn1 (homelab) ~]# vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  OS    2048     running           mysqldb

So initially, from VMAPI’s perspective, the instance is not running, even though it shows as running on the compute node. Here’s a trick to get this information to sync up if it doesn’t do so on its own after a bit:

[root@headnode (homelab) ~]# sdc-vmapi /vms/55d81a4a-bac4-46b5-aee6-6894f103805c?sync=true
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 1037
Content-MD5: uDgxMvNnnSR/tZa0JS1w9A==
Date: Sun, 09 Nov 2014 02:24:51 GMT
Server: VMAPI
x-request-id: 94a38ce0-67b7-11e4-85b7-6118d1934f05
x-response-time: 1348
x-server-name: 18fb541b-e89f-4082-b001-8ed1306240c0
Connection: keep-alive
< ...snip...>
[root@headnode (homelab) ~]# sdc-vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  joyent          2048  running  mysqldb
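
Incidentally, every one of these operations is carried out as a workflow job, like the start job above, and you can check on a job itself from the headnode rather than just polling the VM list. This is only a hedged sketch, assuming an sdc-workflow wrapper is available alongside the other sdc-* tools and behaves like the sdc-vmapi wrapper used above; the job UUID is the one from the earlier start job output:

# Sketch: inspect a workflow job from the headnode.
sdc-workflow /jobs/b5c1b6a0-d097-4571-a6d4-4b5ebfa86764 | json -H name execution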

Now we test shutting down:

[root@headnode (homelab) ~]# sdc-vmadm stop 55d81a4a-bac4-46b5-aee6-6894f103805c
Stop job 4d67901a-07c7-4da3-a97b-3b3758fb3f0e for VM 55d81a4a-bac4-46b5-aee6-6894f103805c created
[root@headnode (homelab) ~]# sdc-vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  OS    2048     running           mysqldb
[root@headnode (homelab) ~]# sdc-vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  OS    2048     running           mysqldb
[root@headnode (homelab) ~]# sdc-vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  OS    2048     shutting_down     mysqldb
[root@headnode (homelab) ~]# sdc-vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  OS    2048     stopped           mysqldb
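
Rather than rerunning the list by hand while the job works through the shutdown, a throwaway polling loop does the same thing. A quick sketch, using only the sdc-vmadm output already shown:

# Sketch: poll from the headnode until the instance reports "stopped".
UUID=55d81a4a-bac4-46b5-aee6-6894f103805c
until sdc-vmadm list | grep "$UUID" | grep stopped >/dev/null; do
  sleep 5
done
sdc-vmadm list | grep "$UUID"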

Finally, we restart and test logging in:

[root@headnode (homelab) ~]# sdc-vmadm start 55d81a4a-bac4-46b5-aee6-6894f103805c
Start job 763f4c13-37bf-4837-8e5d-4961f369b75f for VM 55d81a4a-bac4-46b5-aee6-6894f103805c created
[root@headnode (homelab) ~]# sdc-vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  OS    2048     running           mysqldb
[root@sdc-cn1 (homelab) ~]# zlogin 55d81a4a-bac4-46b5-aee6-6894f103805c
zlogin: zone '55d81a4a-bac4-46b5-aee6-6894f103805c' unknown

Uh oh, so what has happened here? On my SmartOS system, I created my OS zones with JSON files that specified both “zonename” and “alias” as the alias name. Creating a zone in this manner causes the dataset to be named after the alias and mounted under /zones/<alias> instead of /zones/<uuid>, and this was kept intact during the vmadm send/recv above. If your zone datasets are all named by UUID then you should be able to avoid the following, but it’s something to watch out for. So what we’ll need to do now is some [dangerous] house cleaning to fix this:

[root@headnode (homelab) ~]# sdc-vmadm stop 55d81a4a-bac4-46b5-aee6-6894f103805c
Stop job 1881a318-db49-4430-872f-03c1b3f551d1 for VM 55d81a4a-bac4-46b5-aee6-6894f103805c created
[root@headnode (homelab) ~]# sdc-vmadm list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  OS    2048     stopped           mysqldb
[root@sdc-cn1 (homelab) ~]# grep 55d81a4a-bac4-46b5-aee6-6894f103805c /etc/zones/index
mysqldb:installed:/zones/mysqldb:55d81a4a-bac4-46b5-aee6-6894f103805c
[root@sdc-cn1 (homelab) ~]# zfs list | grep zones/mysqldb
zones/mysqldb                                                  626M  9.39G   626M  /zones/mysqldb
[root@sdc-cn1 (homelab) ~]# zfs rename zones/mysqldb zones/55d81a4a-bac4-46b5-aee6-6894f103805c
[root@sdc-cn1 (homelab) ~]# zfs list | grep 55d81a4a-bac4-46b5-aee6-6894f103805c
zones/55d81a4a-bac4-46b5-aee6-6894f103805c                626M  9.39G   626M  /zones/55d81a4a-bac4-46b5-aee6-6894f103805c
[root@sdc-cn1 (homelab) ~]# zfs create zones/55d81a4a-bac4-46b5-aee6-6894f103805c/cores
[root@sdc-cn1 (homelab) ~]# vi /etc/zones/index
(replace instances of mysqldb with 55d81a4a-bac4-46b5-aee6-6894f103805c and save)
[root@sdc-cn1 (homelab) ~]# mv /etc/zones/mysqldb.xml /etc/zones/55d81a4a-bac4-46b5-aee6-6894f103805c.xml
[root@sdc-cn1 (homelab) ~]# vi /etc/zones/55d81a4a-bac4-46b5-aee6-6894f103805c.xml
(change zonepath="/zones/mysqldb" to zonepath="/zones/55d81a4a-bac4-46b5-aee6-6894f103805c" and save)

The instance was stopped, the dataset was renamed, the missing cores dataset was created (since vmadm send/recv doesn’t create it automatically), and the zone index and XML files were updated appropriately. At this point, we can see if all the effort has paid off:

[root@headnode (homelab) ~]# sdc-vmapi /vms/55d81a4a-bac4-46b5-aee6-6894f103805c?sync=true
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 1060
Content-MD5: RRjE3jI0X9qrwKYOkZrEpA==
Date: Sun, 09 Nov 2014 03:03:33 GMT
Server: VMAPI
x-request-id: fc9cbb00-67bc-11e4-85b7-6118d1934f05
x-response-time: 1356
x-server-name: 18fb541b-e89f-4082-b001-8ed1306240c0
Connection: keep-alive
< ...snip...>
[root@headnode (homelab) ~]# sdc-vmadm start 55d81a4a-bac4-46b5-aee6-6894f103805c
Start job 60af5bda-8f53-4d81-9099-cc790a8443ba for VM 55d81a4a-bac4-46b5-aee6-6894f103805c created
[root@headnode (homelab) ~]# sdc-vmadm list | grep -i 55d81a4a-bac4-46b5-aee6-6894f103805c
55d81a4a-bac4-46b5-aee6-6894f103805c  joyent          2048  running  mysqldb
[root@sdc-cn1 (homelab) ~]# zlogin 55d81a4a-bac4-46b5-aee6-6894f103805c
[Connected to zone '55d81a4a-bac4-46b5-aee6-6894f103805c' pts/2]
Last login: Sat Nov  8 23:42:54 on pts/3
   __        .                   .
 _|  |_      | .-. .  . .-. :--. |-
|_    _|     ;|   ||  |(.-' |  | |
  |__|   `--'  `-' `;-| `-' '  ' `-'
                   /  ; Instance (base64 13.4.2)
                   `-'  http://wiki.joyent.com/jpc2/SmartMachine+Base
[root@mysqldb ~]# exit
logout
[Connection to zone '55d81a4a-bac4-46b5-aee6-6894f103805c' pts/2 closed]
[root@sdc-cn1 (homelab) ~]#

As you can see, this has been a non-trivial process just for one zone, and I have yet to test it with KVM instances. Now that SDC is open source, chances are good that some enterprising individual in the community will end up writing a script to help simplify this process. I hope this information proves useful to those of you experimenting with and learning SDC, but again I have to warn you never to attempt this with anything considered production or on any system with data you care about.
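
If you want a head start on such a script, here is a rough sketch of the clean-up half of the process, offered purely as a starting point. It only mirrors the manual steps above, assumes the alias-named dataset layout I described, and hard-codes the UUID and alias from this post where a real tool would take parameters. Run it in the compute node’s global zone with the instance stopped:

#!/bin/bash
# Rough sketch only: not a supported tool. Rename the alias-named dataset to
# the instance UUID, create the missing cores dataset, and point the zone
# index and XML at the new names, exactly as done by hand above.
set -e
UUID=55d81a4a-bac4-46b5-aee6-6894f103805c   # instance UUID from this post
ALIAS=mysqldb                               # alias-named dataset from this post

zfs rename zones/$ALIAS zones/$UUID
zfs create zones/$UUID/cores   # vmadm send/recv does not create this

# Crude, but matches the hand edits above: swap the alias for the UUID in the
# zones index, rename the zone XML, and fix its zonepath.
sed "s|$ALIAS|$UUID|g" /etc/zones/index > /tmp/index.$$ && mv /tmp/index.$$ /etc/zones/index
mv /etc/zones/$ALIAS.xml /etc/zones/$UUID.xml
sed "s|/zones/$ALIAS|/zones/$UUID|" /etc/zones/$UUID.xml > /tmp/zone.xml.$$ && mv /tmp/zone.xml.$$ /etc/zones/$UUID.xml

After something like that runs on the compute node, you would still want the ?sync=true call to VMAPI from the headnode, as shown earlier, before starting the instance again.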

I too am experimenting, and I hope my next post will be able to show a successful conversion of SmartOS to SDC. That is a far more risky proposition and will require a good deal of trial and error. In the meantime, enjoy SDC!