PDF.co API for Word to PDF Conversion and Parsing

Hello,

I am posting this here because the developer community has been so helpful in the past. We have been working with PDF.co support on some issues, but, probably because of a time zone difference, there has been a lot of back and forth and no actual resolution. Here is what we're attempting to do:
  1. Download a .docx file from a custom module's attachments
  2. Obtain a pre-signed URL from PDF.co (they're using AWS) to work with the file
    1. url: the URL used to access the uploaded file
    2. presignedUrl: the URL used to upload the local file
  3. Upload the .docx file to AWS with the pre-signed URL
  4. Convert the .docx file to a PDF using the PDF.co endpoint and the URL obtained from the pre-signed URL upload
  5. Parse the PDF and return the result as JSON to our Deluge custom function
  6. Iterate through the JSON and process the data as needed
A couple more notes before we get into the code:
  1. We tried converting the .docx file to a PDF with the Zoho Writer API, but cell borders disappeared, which caused parsing problems at PDF.co. Everything else worked well, including uploading the PDF to the pre-signed URL.
  2. We attempted to upload the file to PDF.co's built-in storage instead of the pre-signed URL, but when we download the file from Zoho and pass it as a parameter, we get an error saying the file parameter is missing. We believe this is because the upload endpoint wants a file path, and we don't know how to get one from Zoho, or whether that is even possible.
  3. The pre-signed URL endpoint worked for us when we uploaded the Zoho converted PDF file.
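For reference, the attempt in note 2 might be written as a multipart upload like the sketch below. It assumes that PDF.co's storage endpoint is https://api.pdf.co/v1/file/upload, that it expects a form field named "file", and that downloadResponse and headers are the variables from the code that follows. The name storedFileUrl is only illustrative, and we have not confirmed that this avoids the missing-parameter error.

```deluge
// Hypothetical sketch: multipart upload to PDF.co's built-in storage.
// Assumes downloadResponse is the FILE returned by the attachment download
// and headers holds the x-api-key entry.
downloadResponse.setParamName("file");
storageUpload = invokeurl
[
	url :"https://api.pdf.co/v1/file/upload"
	type :POST
	headers:headers
	files:downloadResponse
];
info "Storage Upload: " + storageUpload;
// Assumption: on success the response carries the hosted file's "url".
storedFileUrl = storageUpload.get("url");
info "Stored File URL: " + storedFileUrl;
```

The idea is that the files key sends the attachment itself as multipart form data, so no local file path would be needed.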
Okay, on to the code. Thanks in advance for your help and your patience with this far-from-perfect code! I have omitted the parsing code since we're getting stuck before that point.
  1. thisRec = zoho.crm.getRecordById("Meeting_Details",thisRecId);
  2. accountId = thisRec.get("Account").get("id");
  3. newList = List();
  4. //Get all attachments on meeting detail rec
  5. response = invokeurl
  6. [
  7. url :"https://www.zohoapis.com/crm/v6/Meeting_Details/" + thisRecId + "/Attachments?fields=id,Owner,File_Name,Created_Time,Parent_Id&sort_order=desc&sort_by=Created_Time"
  8. type :GET
  9. connection:"******"
  10. ];
  11. // info response;
  12. responseNull = response.isNull();
  13. info "Attachment Response Empty: " + responseNull;
  14. if(responseNull == true)
  15. {
  16. info "No Attachments to parse";
  17. return;
  18. }
  19. //Get last uploaded file
  20. fileRec = response.get("data").get(0);
  21. info "File Rec: " + fileRec;
  22. fileId = fileRec.get("id");
  23. info "File ID: " + fileId;
  24. fileName = fileRec.get("File_Name");
  25. info "File Name: " + fileName;
  26. createdTime = fileRec.get("Created_Time");
  27. info "Created Time: " + createdTime;
  28. //Download File from Meeting Detail Rec
  29. downloadResponse = invokeurl
  30. [
  31. url :"https://www.zohoapis.com/crm/v6/Meeting_Details/" + thisRecId + "/Attachments/" + fileId
  32. type :GET
  33. connection:"******"
  34. ];
  35. info downloadResponse;
  36. fileCheck = downloadResponse.isFile();
  37. info "File? " + fileCheck;
  38. //
  39. //PDF.co Headers
  40. apiKey = "*******************";
  41. headers = Map();
  42. headers = {"x-api-key":apiKey};
  43. // Upload Permission
  44. setupFileUpload = invokeurl
  45. [
  46. url :"https://api.pdf.co/v1/file/upload/get-presigned-url?name=" + fileName + "&encrypt=false"
  47. type :GET
  48. headers:headers
  49. detailed:true
  50. ];
  51. setupFileUpload = setupFileUpload.get("responseText");
  52. // info "File Upload Response: " + setupFileUpload;
  53. uploadUrl = setupFileUpload.get("presignedUrl");
  54. info "Upload URL: " + uploadUrl;
  55. workingFileUrl = setupFileUpload.get("url");
  56. info "Working URL: " + workingFileUrl;
  57. // Actual Upload
  58. uploadHeaders = Map();
  59. uploadHeaders.put("x-api-key",apiKey);
  60. uploadHeaders.put("Content-Type","application/octet-stream");
  61. uploadParams = Map();
  62. uploadParams.put("file",downloadResponse);
  63. uploadParams.put("expiration",20);
  64. uploadParams.put("async","true");
  65. actualUpload = invokeurl
  66. [
  67. url :uploadUrl
  68. type :PUT
  69. parameters:uploadParams
  70. headers:uploadHeaders
  71. detailed:true
  72. ];
  73. info "Actual Upload: " + actualUpload;
  74. //
  75. //Convert to PDF
  76. conversionURL = "https://api.pdf.co/v1/pdf/convert/from/doc";
  77. conversionHeader = Map();
  78. conversionHeader.put("x-api-key",apiKey);
  79. conversionHeader.put("Content-Type","application/json");
  80. conversionPayload = {"url":workingFileUrl,"async":false,"inline":"true","password":"","profiles":""};
  81. conversionResult = invokeurl
  82. [
  83. url :conversionURL
  84. type :POST
  85. parameters:conversionPayload
  86. headers:conversionHeader
  87. detailed:true
  88. ];
  89. info "Conversion Result: " + conversionResult;
  90. convertedDocURL = conversionResult.get("responseText").get("url");
  91. info "Converted Doc URL: " + convertedDocURL;
We believe the issue occurs during the upload to the pre-signed URL. After we upload the file (lines 66-74), we use workingFileUrl to inspect the document. We can download it, but it is corrupted and opens only with Word's recovery tool. The code continues and converts the file to a PDF, but the result is unintelligible.
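To narrow down where the corruption is introduced, a check along these lines could be added right after the upload. It relies on the responseCode and responseText keys that detailed:true adds to the response. Our reading is that a 200 here, combined with a corrupt file, would point at the request body: the uploadParams map being serialized instead of the raw file bytes being sent. That is an assumption on our part, not something PDF.co has confirmed.

```deluge
// Sketch: inspect the pre-signed PUT result before converting.
// Assumes actualUpload comes from the invokeurl call made with detailed:true.
uploadCode = actualUpload.get("responseCode");
info "Upload Response Code: " + uploadCode;
if(uploadCode != 200)
{
	// The storage service rejected the request, so stop before conversion.
	info "Upload rejected: " + actualUpload.get("responseText");
	return;
}
```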

Thanks so much for your help!