Original date: Feb 10, 2022
This is going to be a series of articles where I document my process of trying to get an existing Java command line utility to run in the browser, served by a Golang backend web server.
I'll say upfront that there is probably a library (or three) out there that already does this. But I won't understand how such a library works without some mental effort and learning. Since I would have to learn those inner workings anyway, I might as well build it myself and learn them properly.
Background and proposed architecture
Here's a brief background. While working on a math PhD, I wrote this program in Java called steenrod, to assist me with my thesis work. This work involved doing lots of calculations in the "Steenrod algebra" and many of its related friends. The details are not important for this series of articles; rather, the main takeaway is that steenrod is a command line Java program with a user loop, where the user inputs some mathematical expression and gets back another mathematical expression. A snippet of it looks like:
(3.037476 ms) [3, 4] x [1, 2] + [4, 2] x [] + [1, 16] x [3, 2] + [2, 8] x [2, 2] + [] x [4, 2]
(Total command time: 0.011054177 seconds)
In the above snippet, the "Enter a command" part is a prompt waiting for input on stdin, and the rest is written to stdout. This brings us to the goal of this little project: to be able to interact with this Java program from the browser, as if we are on the command line -- as if we are writing to stdin from the browser, and getting stdout back.
My guess is that the architecture of this project will look like (thanks draw.io):
Let's dig into this a little more. First, I'm pretty sure JavaScript will be required if I want it to actually look like a terminal in the browser, but I know little about the front-end, so I am putting that off until later. As for the back-end, the two approaches that come to mind are: (1) call the Java program using Go's os/exec package, or (2) put a ServerSocket in front of the current Java program, and then send the user's bytes from the Go server to the Java program via TCP (or HTTP, or whatever protocol -- I'm picking TCP because I want to learn more about it).
Either way, we need to send input to the Java program, and get output back. But there is a design decision to make. On the one hand, we can run the Java program as it already exists in the background as a Linux process, and then send data to it by writing to stdin. On the other hand, we can essentially write an API layer in front of the current Java program and use that for communication.
Roughly, these hands correspond to (1) and (2) above. To me, option (2) sounds more fun and more educational, so I'm going to do that. The downside is that the current user loop in steenrod will need to be rewritten -- I'm not sure what to do about that yet. It expects stdin, but I will be sending TCP messages. I could write an API with each of the possible commands in steenrod, which would bypass the user loop altogether, but it would take a lot more time. For now, let's put that on the back burner and get our feet wet by playing with TCP communication, using ServerSocket and net.Dial.
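For comparison, here is a minimal sketch of what option (1) might look like with os/exec: start a long-running child process, write a line to its stdin, and read a line back from its stdout. Everything named here is an assumption for illustration: the session type and its methods are hypothetical, "cat" is only a stand-in for the Java program (it echoes each line back), and the one-line-in, one-line-out exchange is a simplification that steenrod's real output would not satisfy.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os/exec"
)

// session (hypothetical) wraps a long-running child process and its pipes.
type session struct {
	cmd   *exec.Cmd
	stdin io.WriteCloser
	out   *bufio.Reader
}

// startSession launches the command and connects to its stdin and stdout.
func startSession(name string, args ...string) (*session, error) {
	cmd := exec.Command(name, args...)
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return nil, err
	}
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return nil, err
	}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return &session{cmd: cmd, stdin: stdin, out: bufio.NewReader(stdout)}, nil
}

// ask writes one line to the child's stdin and reads one line of its stdout.
func (s *session) ask(line string) (string, error) {
	if _, err := io.WriteString(s.stdin, line+"\n"); err != nil {
		return "", err
	}
	return s.out.ReadString('\n')
}

// close closes stdin, so the child sees EOF, and waits for it to exit.
func (s *session) close() error {
	s.stdin.Close()
	return s.cmd.Wait()
}

func main() {
	// "cat" is a stand-in that echoes its input; the real command for the
	// Java program is not shown here.
	s, err := startSession("cat")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer s.close()

	reply, err := s.ask("hello")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Print(reply)
}
```

The appeal of this route is that the existing user loop stays untouched. The catch is that the Go side has to know where one reply ends and the next prompt begins.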
TCP Basics
Okay, this is not really "TCP Basics" per se. I'm not going to go into what TCP is. Other than some ACKs and SYNs, I'm not even that sure what TCP is. This is more about the basics of sending and receiving data over a TCP connection.
In Java, we can use a ServerSocket to listen for incoming TCP connections. To regurgitate the documentation here: we will create a ServerSocket object, then use the accept() method to listen for a connection and accept it once it comes in (which creates the actual socket), noting that this method blocks until a connection is made. When we're done, we should call close(). We are not going to consider security at all, at least for now.
Here's a simple TCP server in Java which is stolen from here but which also writes the socket info to stdout. Note that because ServerSocket implements AutoCloseable, and we create it in a try-with-resources statement, we do not need to worry about closing the ServerSocket. The same applies to the Socket we get via accept().

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.util.Date;

    public class TCPServer {
        public static void main(String[] args) throws IOException {
            // try-with-resources closes the listener for us
            try (var listener = new ServerSocket(59000)) {
                System.out.println("The server is running...");
                while (true) {
                    // accept() blocks until a client connects
                    try (var socket = listener.accept()) {
                        // second argument 'true' turns on auto-flush for println()
                        var out = new PrintWriter(socket.getOutputStream(), true);
                        out.println((new Date()).toString());
                        out.println(socket.toString());
                    }
                }
            }
        }
    }
Now let's write the simplest TCP client in Go. (I know I got the basic blueprint of this from somewhere on the internet, but I forget where. Sorry...)
    package main

    import (
    	"fmt"
    	"net"
    	"os"
    )

    func main() {
    	arguments := os.Args
    	if len(arguments) != 2 {
    		fmt.Println("Format is host:port")
    		return
    	}

    	address := arguments[1]
    	c, err := net.Dial("tcp", address)
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	defer c.Close()

    	buf := make([]byte, 256)

    	// A single Read returns whatever bytes are available, up to len(buf).
    	n, err := c.Read(buf)
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	fmt.Printf("%d bytes read\n", n)
    	fmt.Println(string(buf))
    }

Explanation: we pass this program an address of the form "host:port", then get back a Conn from net.Dial("tcp", address), then call Read() on that Conn, which fills our []byte buffer with the response from the TCP connection.
Running the Java server in the background, and then running the Go client against it, shows the output:
    80 bytes read
    Wed Feb 09 01:39:58 UTC 2022
    Socket[addr=/127.0.0.1,port=52828,localport=59000]

If our buffer were smaller, say 64 bytes, a single Read() would return at most 64 bytes, and the rest of the response would never be printed. The port 52828 is an ephemeral port that I'm assuming the OS (Linux in this case) chose based on some criteria, such as "it was available".
For fun and to see if I could break something, I ran this 10,000 times, with a 10ms sleep between each new Dial(), and you can see Linux rotate through the ephemeral port range, eventually having to roll over:
    80 bytes read
    Wed Feb 09 01:47:12 UTC 2022
    Socket[addr=/127.0.0.1,port=60996,localport=59000]
    80 bytes read
    Wed Feb 09 01:47:12 UTC 2022
    Socket[addr=/127.0.0.1,port=60998,localport=59000]
    80 bytes read
    Wed Feb 09 01:47:12 UTC 2022
    Socket[addr=/127.0.0.1,port=32768,localport=59000]
    80 bytes read
    Wed Feb 09 01:47:12 UTC 2022
    Socket[addr=/127.0.0.1,port=32770,localport=59000]
However, something else happens which is mysterious to me. Intermittently, there is an output which only includes the date, and not the socket.toString() portion. Notice that the port jump (jumping 2 at a time) still appears to have happened for the smaller response:
    80 bytes read
    Wed Feb 09 01:46:27 UTC 2022
    Socket[addr=/127.0.0.1,port=52868,localport=59000]
    80 bytes read
    Wed Feb 09 01:46:27 UTC 2022
    Socket[addr=/127.0.0.1,port=52870,localport=59000]
    29 bytes read
    Wed Feb 09 01:46:27 UTC 2022
    80 bytes read
    Wed Feb 09 01:46:28 UTC 2022
    Socket[addr=/127.0.0.1,port=52874,localport=59000]

I'm not sure why this happens, but out of 10,000 runs, a few hundred come back short like this. Perhaps something to do with the Java server being single-threaded? Buffer-related? And why does the Date() line always arrive successfully? Spoiler: I find the answer below.
Writing and reading to and from the TCP channel
In the previous example, we only read from a TCP server. Now we'll send a message to the server and get something back. Again, I'm going to base the Java server off of this webpage, with some changes, such as the omission of multiple threads. The server looks like:
    import java.io.IOException;
    import java.io.PrintWriter;
    import java.net.Socket;
    import java.net.ServerSocket;
    import java.util.Scanner;

    public class TCPServer {
        public static void main(String[] args) throws IOException {
            try (var listener = new ServerSocket(59000)) {
                System.out.println("The server is running...");
                while (true) {
                    // Handle one client at a time; no extra threads
                    try (var socket = listener.accept()) {
                        echo(socket);
                    }
                }
            }
        }

        public static void echo(Socket socket) {
            try {
                var in = new Scanner(socket.getInputStream());
                var out = new PrintWriter(socket.getOutputStream(), true);
                out.println("Socket info: " + socket.toString());
                // Echo each line until the client closes its side
                while (in.hasNextLine()) {
                    out.println("Received input: " + in.nextLine());
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

Explanation: this server scans the incoming data line by line, and writes each line back to the other side of the connection, saying that the input was received. Note that we really should close the Scanner and PrintWriter objects in a finally block, but I've not done that here for brevity (read: laziness).
Now, we'll expand our TCP client in Go so that it actually sends a message to the server:
    package main

    import (
    	"fmt"
    	"net"
    	"os"
    )

    func main() {
    	arguments := os.Args
    	if len(arguments) != 2 {
    		fmt.Println("Format is host:port")
    		return
    	}

    	address := arguments[1]
    	callTCP(address)
    }

    func callTCP(address string) {
    	c, err := net.Dial("tcp", address)
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	defer c.Close()

    	// The trailing newline matters: the server waits for a complete line.
    	msg := []byte("Hello!\n")
    	_, err = c.Write(msg)
    	if err != nil {
    		fmt.Println("TCP write error")
    		return
    	}

    	buf := make([]byte, 256)

    	n, err := c.Read(buf)
    	if err != nil {
    		fmt.Println("TCP read error")
    		return
    	}
    	fmt.Printf("%d bytes read\n", n)
    	fmt.Println(string(buf))
    }

This is a pretty self-explanatory extension of the previous program. We added a section that fills a new byte slice and writes the message "Hello!" to the connection. The only subtlety is that we need to end the message with a newline character, because the Java server is checking hasNextLine().
Now let's run it against the server:
    64 bytes read
    Socket info: Socket[addr=/127.0.0.1,port=45056,localport=59000]
Wait, where's the acknowledgement of a message? Running it again:
    87 bytes read
    Socket info: Socket[addr=/127.0.0.1,port=45058,localport=59000]
    Received input: Hello!
This is similar behavior to what we've seen before. Why are we not receiving the entire response? Well, what's a similarity here? Notice in both cases that the first line we write to the PrintWriter is received by the Go client. With our earlier example, I thought maybe there was an issue with the Socket.toString() method -- perhaps it was taking too long -- but that's the one that's working every time now. Ah! It's probably related to the buffer. It might be filling with the first line, and giving us that as the response on the Go side.
Okay, let's add a 10ms sleep and then repeat the Read() to see if we can get the rest of what's being "offered" by the Java server:
    // Appended to the end of callTCP; "time" must be added to the imports.
    time.Sleep(10 * time.Millisecond)
    n, err = c.Read(buf)
    if err != nil {
    	fmt.Println("TCP read error")
    	return
    }
    fmt.Printf("%d bytes read\n", n)
    fmt.Println(string(buf))
Now re-run and we get:
    64 bytes read
    Socket info: Socket[addr=/127.0.0.1,port=45084,localport=59000]
    23 bytes read
    Received input: Hello!
    r=/127.0.0.1,port=45084,localport=59000]

Nice! That solves it. Yeah, there's some leftover data in the byte slice from the previous read, because the second Read() only overwrote the first 23 bytes, but obviously we would clear that out in real code -- we're just playing around here! And that's also why I'm including all of this, because the debugging and playing is an important part of programming, and a lot of the time we just see the final versions of things online, where the author hasn't documented the struggles they went through. Of course there is a balance to be struck there.
So why do we sometimes get both output lines from the Java program's output stream, and sometimes we need to run Read() twice? I'm not really sure. There's probably something going on here with how TCP works at a lower level, something to do with the timing of waiting on a TCP connection for a message? It's interesting and worth some digging.
Next time
Anyway, this is a good stopping point for now. Next on the agenda will be to send mathematical expressions to the TCP server and actually parse them, do arithmetic, and send a result back to the client.
I have a couple of other questions in my head right now for the future:
- How much input validation should be handled in the Go server, versus letting steenrod handle it? There is some validation in steenrod but it's still fairly brittle.
- What to do when (not if) the Java program crashes? As in the above point, there are parts that are brittle, where formatting is not handled correctly. A user could crash the program purposefully if they wanted. Then the only recourse would be to restart the Java program. Does the Go server have permission to do that? We'd have to make sure that the OS permissions are aligned, so we would probably start the Java program from the Go server in the first place, so that the Java process is always owned by the user running the server.
A partial remedy to both of the above is to strengthen the Java program and make it more resilient -- maybe even some nasty giant try-catch blocks.
Until next time.