rccharles

ASC Adjust clipboard data

May 3rd, 2019
(*
  This AppleScript converts clipboard input into a format suited for pasting into an ASC
  (Apple Support Communities) reply.  I observed that text I copied into an ASC reply was not
  formatted well, while copies from a web browser were formatted much better.  So I adjust
  the clipboard to the format a web browser produces, for best results.

  This AppleScript accepts the clipboard in either of two forms:
  -- plain text, which is converted to HTML.  Conversion is limited to inserting paragraph tags for blank lines and inserting links where http or https text appears.  The page title is substituted for the link text.
  -- HTML source code, identified by text containing HTML markup.
          Caveat emptor.

  To use:
  1) Copy (command + C) the data you want to convert.
  2) Run this AppleScript by double-clicking on the app.
  3) Paste (command + V) into an ASC reply.

  I have tested in Waterfox 56.2.9 in Yosemite.  I assume the process will work with other web browsers and other versions of macOS.

  Save as an Application Bundle.  Don't check any of the boxes.

  Should you experience a problem, run in the Script Editor.
    This script shows how to debug via the on run path, how to receive items added to a folder, and how to use log statements.
    It is easier to diagnose problems with debug information. I suggest adding log statements to your script to see what is going on.  This script is an example.

  For testing, run in the Script Editor.
    1) Click on the Event Log tab to see the output from the log statements.
    2) Click on Run.

  Change log:
  May 1, 2019 -- skip "403 Forbidden" titles.
  May 2, 2019 -- convert \" to ".  The \" mysteriously appears in HTML source code input.
                 Probably an artifact of copying into TextEdit and back out of TextEdit.

  Enhancements:
    -- get PDF title

  Author: rccharles

  Copyright 2019 rccharles

        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:

        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.

        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.


    Example text document:
set the clipboard to "\"Effective defenses 111 threats\" by John Galt
https://discussions.apple.com/docs/DOC-8841
\"Avoid phishing emails 222 and other scams\"

https://support.apple.com/en-ca/HT204759



blank lines
also,see:http://www.google.com/ seeing again:http://www.google.com"

 *)


-- Invoked when you run the script in the Script Editor or double-click the app icon.
on run
    global debug
    set debug to 2

    set theList to clipboard info
    printClipboardInfo(theList)

    set cbInfo to get (clipboard info) as string

    -- Most likely, if we have HTML data in the clipboard it will be from a web browser or Word.
    if cbInfo contains "HTML" then
        try
            log "Working with HTML class data from clipboard."
            set theBoard to the clipboard as «class HTML»
            log "Print out the HTML data on the clipboard"
            log theBoard

            set normalHtml to do shell script "osascript -e 'try' -e 'get the clipboard as «class HTML»' -e 'end try' | awk '{sub(/«data HTML/, \"\") sub(/»/, \"\")} {print}' | xxd -r -p "
            log normalHtml
            adjustBrowserHTML(normalHtml)
            set returnedData to theBoard
        on error errStr number errorNumber
            log "===> We didn't find HTML data.   errStr is " & errStr & " errorNumber is " & errorNumber
            return
        end try
    else
        -- Otherwise work with plain text.
        try
            log "Working with plain text"
            set clipboardData to (the clipboard as text)
            if debug ≥ 2 then
                log "class clipboardData is " & class of clipboardData
                log "calling printHeader."
            end if
            printHeader("clipboardData", clipboardData)
        on error errStr number errorNumber
            log "===> We didn't find data on the clipboard.   errStr is " & errStr & " errorNumber is " & errorNumber
            display dialog "We found neither HTML source code nor plain text on the clipboard." & return & "Please copy from a different source." giving up after 15
            return 1
        end try

        set returnedData to common(clipboardData)
    end if
    postToCLipboard(returnedData)
    -- return code
    return 0


end run

-- Folder actions.
-- Invoked when something is dropped on the folder that this script is monitoring.
-- Right-click on the folder to be monitored, then choose Services > Folder Actions Setup...
on adding folder items to this_folder after receiving added_items
    -- TBD

end adding folder items to



-- Invoked when something is dropped on this AppleScript icon.
on open dropped_items

    global debug
    set debug to 2
    (*
    -- Debug code.
      set fileName to choose file with prompt "get file"
      set dropped_items to {fileName}
    *)
    log "class of dropped_items is " & class of dropped_items
    display dialog "You dropped " & (count of dropped_items) & " item or items." & return & "  Caveat emptor. You have been warned." giving up after 6

    set totalFileData to ""
    repeat with droppedItem in dropped_items
        log "The droppedItem is "
        -- display dialog "processing file " & (droppedItem as string) giving up after 3
        log droppedItem
        log "class = " & class of droppedItem
        set extIs to findExtension(droppedItem)
        set extIsU to makeCaseUpper(extIs)
        if extIsU is "HTML" or extIsU is "HTM" or extIsU is "TEXT" or extIsU is "TXT" then
            try
                set theFile to droppedItem as string
                set theFile to open for access file theFile
                set allOfFile to read theFile
                close access theFile
                printHeader("read from file ( allOfFile )", allOfFile)
                set totalFileData to totalFileData & common(allOfFile)
            on error theErrorMessage number theErrorNumber
                log theErrorMessage & "error number " & theErrorNumber
                close access theFile
            end try

        else
            -- We do not support this extension.
            display dialog "We only support files with an extension of html, htm, text or txt, in either case. Your file had a " & extIs & " extension. Skipping." giving up after 10

        end if
    end repeat

    postToCLipboard(totalFileData)
    -- return code
    return 0

end open


-- ------------------------------------------------------
on common(clipboardData)
    global debug
    set lf to character id 10
    -- Write a message into the event log.
    log "  --- Starting on " & ((current date) as string) & " --- "
    set cbInfo to get (clipboard info) as string


    -- Don't let Windoze confuse us: convert Return LineFeed (CRLF) to lf.
    set clipboardData to alterString(clipboardData, return & lf, lf)
    -- Might as well convert the classic Mac OS return to lf too, so there are fewer cases to look for.
    set clipboardData to alterString(clipboardData, return, lf)

    -- Figure out what type of data we have: plain text or HTML source code text.
    set paraCount to count of textToList(clipboardData, "<p")
    set endparaCount to count of textToList(clipboardData, "</p>")
    set titleCount to count of textToList(clipboardData, "<title")
    set endTitleCount to count of textToList(clipboardData, "</title>")
    set aLinkCount to count of textToList(clipboardData, "href=\"http")
    -- mangled href="http
    set mangledLinkCount to count of textToList(clipboardData, "href=\\\"http")
    set brCount to count of textToList(clipboardData, "<br>")
    if debug ≥ 1 then
        log "Values used to distinguish HTML source code from plain text."
        log "paraCount  is " & paraCount
        log "endparaCount is " & endparaCount
        log "titleCount is " & titleCount
        log "endTitleCount is " & endTitleCount
        log "aLinkCount is " & aLinkCount
        log "brCount is " & brCount
        log "mangledLinkCount is " & mangledLinkCount
    end if
    --set endHttpCount to count of textToList(clipboardData, "http://")
    --set endHttpsCount to count of textToList(clipboardData, "https://")
    -- Note: textToList returns a count one greater than the number of matches, because item 1 is the data before the first match.
    -- The parentheses only make the default precedence (and before or) explicit.
    if (paraCount ≥ 4 and endparaCount ≥ 3) or brCount ≥ 4 or ((titleCount is endTitleCount) and titleCount ≥ 2) or aLinkCount ≥ 3 or mangledLinkCount ≥ 3 then
        log "... found HTML input ... in plain text format."
        -- Strange \" sequences are appearing in input text.  Probably the result of using TextEdit along the way.
        -- Quick hack: turn them back into plain quotes.
        set alteredClipboardData to alterString(clipboardData, "\\\"", "\"")
        set readyData to typeHTML(alteredClipboardData)
    else
        log "... found plain text input ..."
        set readyData to typeText(clipboardData)

    end if
    return readyData
end common

-- ------------------------------------------------------
(*
Free version of Parallels for individual use:</p><p><br></p>
<p>https://itunes.apple.com/us/app/parallels-desktop-lite/id1085114709?mt=12</p><p><br></p>
<p>    Full version</p><p><a href="http://www.parallels.com/en/products/desktop/" target="_blank">
     http://www.parallels.com/en/products/desktop/</a>

If ASC finds a URL outside of an <a> tag, it will place blank lines around the URL. It will not go the
full nine yards and wrap the URL in an <a> tag.

*)
on adjustBrowserHTML(normalHtml)
    -- Placeholder: no adjustments are made yet.

end adjustBrowserHTML
-- ------------------------------------------------------
(*
alterString
  thisText is the input string to change
  delim is the string to replace.  It doesn't have to be a single character.
  replacement is the new string

  returns the changed string.
*)

on alterString(thisText, delim, replacement)
    set resultList to {}
    set {tid, my text item delimiters} to {my text item delimiters, delim}
    try
        set resultList to every text item of thisText
        set text item delimiters to replacement
        set resultString to resultList as string
        set my text item delimiters to tid
    on error
        set my text item delimiters to tid
    end try
    return resultString
end alterString

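(*
  Usage sketch for alterString (illustrative only, not part of the original script).
  The sample strings are made up; the calls mirror how common() uses the handler.

    set lf to character id 10
    -- Normalise a classic Mac OS line ending to a linefeed:
    alterString("one" & return & "two", return, lf)
    -- result: "one" & lf & "two"

    -- Undo the stray backslash-quote sequences seen in HTML source input:
    alterString("href=\\\"http://example.com\\\"", "\\\"", "\"")
    -- result: href="http://example.com"
*)
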
-- ------------------------------------------------------
(*
  Split theString at theToken.
  Returns {text from theToken to the end, text before theToken}.
  When theToken is not found, returns {theString, ""}.
*)
on answerAndChomp(theString, theToken)
    set debugging to false
    set theOffset to offset of theToken in theString
    if debugging then log "theOffset is " & theOffset
    set theLength to length of theString
    if theOffset > 0 then
        set beginningPart to text 1 thru (theOffset - 1) of theString
        if debugging then log "beginningPart is " & beginningPart

        set chompped to text theOffset thru theLength of theString
        if debugging then log "chompped is " & chompped
        return {chompped, beginningPart}
    else
        set beginningPart to ""
        return {theString, beginningPart}
    end if

end answerAndChomp

-- ------------------------------------------------------
(*
  Delete the leading part of the string up to and including theToken.
  Returns "" when theToken is not found.
*)
on chompLeftAndTag(theString, theToken)
    set debugging to false
    --log text 1 thru ((offset of "my" in s) - 1) of s
    --set rightString to offset of theToken in theString thru count of theString of theString
    set theOffset to offset of theToken in theString
    if debugging then log "theOffset is " & theOffset
    set theLength to length of theString
    if debugging then log "theLength is " & theLength
    if theOffset > 0 then
        set chompped to text (theOffset + (length of theToken)) thru theLength of theString
        if debugging then log "chompped is " & chompped
        return chompped
    else
        return ""
    end if
end chompLeftAndTag

-- ------------------------------------------------------
(*
Yvan Koenig
https://macscripter.net/viewtopic.php?id=43133
*)
on findExtension(inputFileName)
    set fileName to inputFileName as string
    set saveTID to AppleScript's text item delimiters
    set AppleScript's text item delimiters to {"."}
    set theExt to last text item of fileName
    set AppleScript's text item delimiters to saveTID
    --log "theExt is " & theExt
    if theExt ends with ":" then set theExt to text 1 thru -2 of theExt
    --log "theExt is " & theExt
    return theExt
end findExtension

-- ------------------------------------------------------
(*
  Debug helper: converts a hex string back to text and logs the result.
  http://krypted.com/mac-os-x/to-hex-and-back/
*)
on hexToString(hex)
    log "in hexToString"
    log "hex string is " & hex
    set toUnix to "echo " & hex & " | xxd -r -p "
    log "toUnix is " & toUnix
    try
        set fromUnix to do shell script toUnix
        log "fromUnix is " & fromUnix
    on error errMsg number n
        log "convert hex string to string failed. " & errMsg & " with number " & n
    end try
end hexToString


-- ------------------------------------------------------
(*
  Returns the smallest non-zero number in L, or 0 when every item is 0.
  Returns null for an empty list.

  https://stackoverflow.com/questions/55838252/minimum-value-that-not-zero
    set m to get minimumPositiveNumber from {10, 2, 0, 2, 4}
    log "m is " & m
    set m to minimumPositiveNumber from {0, 0, 0}
    log "m is " & m

*)
on minimumPositiveNumber from L
    local L

    if L = {} then return null

    set |ξ| to 0

    repeat with x in L
        set x to x's contents
        if (x < |ξ| and x ≠ 0) ¬
            or |ξ| = 0 then ¬
            set |ξ| to x
    end repeat

    |ξ|
end minimumPositiveNumber

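(*
  Sketch of how typeText uses minimumPositiveNumber to find where a URL ends
  (illustrative only; example.com is a placeholder, not a URL from the script).
  An offset of 0 means "not found", which is why zeros have to be skipped.

    set fragment to "https://www.example.com/page more text"
    set endsWhere to {offset of " " in fragment, offset of (character id 10) in fragment}
    -- endsWhere is {29, 0}: a blank at position 29 and no linefeed
    set endOfURL to (minimumPositiveNumber from endsWhere) - 1
    -- endOfURL is 28, so text 1 thru endOfURL of fragment is "https://www.example.com/page"

  When neither a blank nor a linefeed follows the URL, both offsets are 0, the
  handler returns 0 and endOfURL becomes -1, which typeText treats as end of input.
*)
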
-- ------------------------------------------------------
(*
  makeCaseUpper("Now is the time, perhaps, for all good men")
*)
on makeCaseUpper(theString)
    set UC to "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    set LC to "abcdefghijklmnopqrstuvwxyz"
    set C to characters of theString
    repeat with ch in C
        if ch is in LC then set contents of ch to item (offset of ch in LC) of UC
    end repeat
    return C as string
end makeCaseUpper

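(*
  Usage sketch for findExtension and makeCaseUpper together, the way on open
  combines them (illustrative only; the paths are made-up HFS-style examples).

    findExtension("Macintosh HD:Users:me:Desktop:notes.txt")
    -- result: "txt"
    makeCaseUpper("txt")
    -- result: "TXT"

    -- A bundle or folder path ends with ":", which findExtension trims:
    findExtension("Macintosh HD:Applications:Safari.app:")
    -- result: "app"
*)
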
-- ------------------------------------------------------
on postToCLipboard(pleasePost)
    try
        -- osascript -e "set the clipboard to «data HTML${hex}»"
        -- pleasePost is the hex string; the closing » and the closing shell quote end the command.
        set toUnixSet to "osascript -e \"set the clipboard to «data HTML" & pleasePost & "»\""
        log "toUnixSet is " & printHeader("toUnixSet", toUnixSet)

        set fromUnixSet to do shell script toUnixSet
        log "fromUnixSet is " & fromUnixSet

    on error errMsg number n
        log "==> We tried to send back HTML data, but failed. " & errMsg & " with number " & n
    end try
    -- See what ended up on the clipboard.
    set theList2 to clipboard info
    printClipboardInfo(theList2)
end postToCLipboard

-- ------------------------------------------------------
on printClipboardInfo(theList)
    log (clipboard info)
    log class of theList
    log "Data types on the clipboard ... "
    printList("", theList)
    log "... "
end printClipboardInfo

-- ------------------------------------------------------
(* Pump out the beginning of theString *)
on printHeader(theName, theString)
    global debug
    if debug ≥ 3 then
        log "in printHeader"
        log theString
        log length of theString
    end if
    if length of theString ≤ 0 then
        log "==> no string to print"
    else
        log theName & " is " & text 1 thru (minimumPositiveNumber from {400, length of theString}) of theString & "<+++++++++"
    end if
end printHeader

-- ------------------------------------------------------
(*
Print out the items in a list.
*)

on printList(theName, splits)
    try
        set theCount to 1
        repeat with theEntry in splits
            --log "class of theEntry is " & class of theEntry
            set classDisplay to class of theEntry as text
            --log "classDisplay is " & classDisplay as text
            --log "class of classDisplay is " & class of classDisplay
            if classDisplay is "list" then
                log "    " & theName & theCount & " is " & item 1 of theEntry & "; " & item 2 of theEntry
            else
                log "    " & theName & theCount & " is " & theEntry
            end if
            set theCount to theCount + 1
        end repeat
    on error errMsg number n
        log "==> No go in printList. " & errMsg & " with number " & n
    end try
end printList

-- ------------------------------------------------------
(*
splitTextToList splits thisText on delim and keeps the delimiter.
  thisText is the input string
  delim is what to split on

  results returned in a list

  Total hack. We know textToList strips off delim, so add it back to
  every item after the first.
*)

on splitTextToList(thisText, delim)

    set returnedList to textToList(thisText, delim)
    set resultArray to {}
    copy item 1 of returnedList to the end of the resultArray

    repeat with i from 2 to (count of returnedList)
        set newElement to delim & item i of returnedList
        copy newElement to the end of the resultArray
    end repeat

    return resultArray
end splitTextToList

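(*
  Usage sketch for splitTextToList (illustrative only; the .test host names are
  made up).  Item 1 is the text before the first link; every later item starts
  with the delimiter, which is how typeText gets one URL per item.

    splitTextToList("see http://a.test and http://b.test", "http://")
    -- result: {"see ", "http://a.test and ", "http://b.test"}
*)
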
-- ------------------------------------------------------
(*
  Retrieve the data between the start tag and the end tag: whatever is between the two strings.
*)
on tagContent(theString, startTag, endTag)
    try
        log "in tagContent. " & " startTag is " & startTag & " endTag is " & endTag
        set beginningOfTag to chompLeftAndTag(theString, startTag)
        if length of beginningOfTag ≤ 0 then
            set middleText to ""
        else
            printHeader("beginningOfTag", beginningOfTag)
            set endingOffset to (offset of endTag in beginningOfTag)
            if endingOffset ≤ (length of endTag) then
                set middleText to ""
            else
                set middleText to text 1 thru (endingOffset - 1) of beginningOfTag
                log "middleText is " & printHeader("middleText", middleText)
            end if
        end if
    on error errMsg number n
        log "finding contained text failed. " & errMsg & " with number " & n
        set middleText to ""
    end try
    return middleText
end tagContent

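(*
  Usage sketch for tagContent (illustrative only; the sample page is made up).
  It assumes the global debug has already been set, as on run and on open do,
  because printHeader reads it; otherwise the try block catches the error and
  "" comes back.

    tagContent("<html><title>Example Page</title></html>", "<title", "</title>")
    -- result: ">Example Page"

  The start tag is given without its closing ">" so that attributes are allowed,
  which is why the result still begins with ">".  typeText removes everything up
  to and including that first ">" afterwards.
*)
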
(*
textToList splits thisText on delim.
  thisText is the input string
  delim is what to split on

  returns a list of strings.  The delimiter itself is not included in the items.

- textToList was found here:
- http://macscripter.net/viewtopic.php?id=15423

*)

on textToList(thisText, delim)
    set resultList to {}
    set {tid, my text item delimiters} to {my text item delimiters, delim}

    try
        set resultList to every text item of thisText
        set my text item delimiters to tid
    on error
        set my text item delimiters to tid
    end try
    return resultList
end textToList

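(*
  Usage sketch for textToList (illustrative only; the sample HTML is made up).
  It shows why common() compares its counts against thresholds one higher than
  the number of tags it wants to see: the text before the first delimiter is
  always returned as item 1, even when it is empty.

    textToList("<p>a</p><p>b</p>", "<p")
    -- result: {"", ">a</p>", ">b</p>"}, so the count is 3 for 2 "<p" tags

    textToList("no markup here", "<p")
    -- result: {"no markup here"}, so the count is 1 for 0 tags
*)
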
-- ------------------------------------------------------
on typeHTML(theData)
    global debug
    log "in typeHTML" & return & "  Try to send back HTML."
    try
        set clipboardDataQuoted to quoted form of theData
        log "quoted form is " & printHeader("clipboardDataQuoted", clipboardDataQuoted)
        -- make hex string as required for HTML data on the clipboard
        set toUnix to "/bin/echo -n " & clipboardDataQuoted & " | hexdump -ve '1/1 \"%.2x\"'"
        log "toUnix is " & printHeader("toUnix", toUnix)

        set fromUnix to do shell script toUnix

        log "fromUnix is " & printHeader("fromUnix", fromUnix)
        if debug ≥ 2 then
            log "displaying original string --- so we can tell if it converted successfully. "
            hexToString(fromUnix)
        end if
    on error errMsg number n
        log "==> convert to hex string failed. " & errMsg & " with number " & n
        set fromUnix to ""
    end try
    return fromUnix
end typeHTML

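(*
  Sketch of the hex round trip that typeHTML and hexToString rely on
  (illustrative only; the sample markup is made up).  The HTML flavor of the
  clipboard is set from hex digits inside «data HTML ... », which is why the
  text is piped through hexdump first.

    set hex to do shell script "/bin/echo -n '<p>Hi</p>' | hexdump -ve '1/1 \"%.2x\"'"
    -- hex is "3c703e48693c2f703e"

    do shell script "echo " & hex & " | xxd -r -p"
    -- result: "<p>Hi</p>"
*)
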
-- ------------------------------------------------------
on typeText(theData)
    (*
        Line endings by platform:
        Unix-like systems (Linux, macOS)    LF      0A      \n
        Microsoft Windows                   CRLF    0D 0A   \r\n
        classic Mac OS                      CR      0D      \r    AppleScript return
    *)
    global debug
    set lf to character id 10
    log "in typeText"
    printHeader("the input  ( theData )", theData)
    -- Example: -- https://discussions.apple.com/docs/DOC-8841
    -- locate links

    set theOutputBuffer to theData
    set countOf to 1
    -- file is mostly for testing, but should be ok for production too.
    set linkId to {"https://", "http://"}
    repeat with lookForLink in linkId

        set splitOnHTTPorHTTPS to splitTextToList(theOutputBuffer, lookForLink)
        log "display splitOnHTTPorHTTPS.."

        -- debug info
        if debug ≥ 2 then
            repeat with theCurrentHTTPorHTTPS in splitOnHTTPorHTTPS
                printHeader("#" & countOf & " theCurrentHTTPorHTTPS ", theCurrentHTTPorHTTPS)
                set countOf to countOf + 1
            end repeat
        end if

        set buildHTML to beginning of splitOnHTTPorHTTPS
        log "buildHTML is " & buildHTML
        -- delete the first item text
        set splitOnHTTPorHTTPS to rest of splitOnHTTPorHTTPS
        log splitOnHTTPorHTTPS
        set counti to 1
        repeat with theCurrentHTTPorHTTPS in splitOnHTTPorHTTPS
            -- Example of a converted URL when no title is found:
            -- <a href="https://discussions.apple.com/docs/DOC-8841" target="_blank">https://discussions.apple.com/docs/DOC-8841</a>

            if debug ≥ 1 then
                set toUnix to "/bin/echo -n " & quoted form of theCurrentHTTPorHTTPS & " | hexdump -C"
                set fromUnix to do shell script toUnix
                log "fromUnix is " & return & fromUnix
            end if

            -- Find the end of the URL by splitting on blank or linefeed.
            -- unsafe characters includes the blank/empty space and " < > # % { } | \ ^ ~ [ ] `
            -- https://perishablepress.com/stop-using-unsafe-characters-in-urls/
            -- The clipboard string may end right after the URL, hence no " ", LF or CR.
            -- Remember, CRLF and CR were converted to LF in common().
            set endsWhere to {}
            copy (offset of " " in theCurrentHTTPorHTTPS) to the end of the endsWhere
            copy (offset of lf in theCurrentHTTPorHTTPS) to the end of the endsWhere

            log endsWhere
            set endOfURL to (minimumPositiveNumber from endsWhere) - 1

            if endOfURL = -1 then
                -- We have reached the end of the input
                set theURL to theCurrentHTTPorHTTPS
            else
                set theURL to text 1 thru endOfURL of theCurrentHTTPorHTTPS
            end if
            log "--------------------------- " & theURL & "--------------------------- "
            -- "curl --silent --location --max-time 15 " & theURL
            set toUnix to "curl --silent --location --max-time 10 " & quoted form of theURL
            log "toUnix  is " & toUnix
            try
                log "reading link file to get title"
                set fromUnix to do shell script toUnix
                log "fromUnix"
                printHeader("fromUnix", fromUnix)
                -- We may not be working with an HTML document, so the found title may be too long or confused.
                log "how far?"
                set actualTagData to tagContent(fromUnix, "<title", "</title>")
                log "actualTagData  is " & printHeader("actualTagData", actualTagData)
                if actualTagData is "" then
                    set actualTagData to theURL
                else if length of actualTagData > 140 then
                    log "length of actualTagData is " & length of actualTagData
                    set actualTagData to theURL
                    -- curl https://appleid.apple.com returns <title>403 Forbidden</title>
                else if actualTagData contains "403" and actualTagData contains "Forbidden" then
                    set actualTagData to theURL
                else
                    -- There could be some attributes within the tag.
                    -- An attribute could have a > in it. Ignoring that for now.
                    set actualTagData to text ((offset of ">" in actualTagData) + 1) thru (length of actualTagData) of actualTagData
                    -- Found a line end in a title, which caused confusion.
                    set actualTagData to alterString(actualTagData, return & lf, "  ")
                    set actualTagData to alterString(actualTagData, return, " ")
                    set actualTagData to alterString(actualTagData, lf, "  ")
                end if
            on error errMsg number n
                log "==> Error occurred when looking for title. " & errMsg & " with number " & n
                set actualTagData to theURL
            end try
            set assembled to "<a href=\"" & theURL & "\" target=\"_blank\">" & actualTagData & "</a>"
            log "assembled  is " & assembled
            if endOfURL = -1 then
                -- We have reached the end of the input
                set buildHTML to buildHTML & assembled
            else

                set buildHTML to buildHTML & assembled & text from (endOfURL + 1) to (length of theCurrentHTTPorHTTPS) of theCurrentHTTPorHTTPS
            end if
            -- wrap up
            set theOutputBuffer to buildHTML
            log "transformed text from buildHTML is  " & return & buildHTML
            log "#" & counti & " transformed text from buildHTML is  " & return & buildHTML
            -- number of links found
            set counti to counti + 1

        end repeat -- looking for all links of the same type in document
    end repeat -- scanning for https and http links

  685.     (* add paragraphs *)
  686.    
  687.     -- start the theOutputBuffer with a paragraph tag.  We are taking a simple approach at this time.
  688.     set theOutputBuffer to "<p>" & theOutputBuffer
  689.     --  LF
  690.     -- Remember CRLF was changed to LF above and CR was chanaged to LF above.
  691.     -- we don't want no Windoze problems
  692.     set theOutputBuffer to alterString(theOutputBuffer, lf & lf, "</p><p> </p><p>")
  693.    
  694.     -- Does the string end with a dangling paragraph?  
  695.     if debug ≥ 3 then
  696.         log "length of theOutputBuffer is " & length of theOutputBuffer
  697.         log "((length of theOutputBuffer) - 2) is " & ((length of theOutputBuffer) - 2)
  698.         log "(length of theOutputBuffer)  is " & (length of theOutputBuffer)
  699.         log "((length of theOutputBuffer) - 3) is " & ((length of theOutputBuffer) - 3)
  700.     end if
  701.     if text ((length of theOutputBuffer) - 2) thru (length of theOutputBuffer) of theOutputBuffer is "<p>" then
  702.         set theOutputBuffer to text 1 thru ((length of theOutputBuffer) - 3) of theOutputBuffer
  703.     else if text ((length of theOutputBuffer) - 2) thru (length of theOutputBuffer) of theOutputBuffer is not "</p>" then
  704.         set theOutputBuffer to theOutputBuffer & "</p>"
  705.     end if
  706.    
  707.     log "theOutputBuffer is " & return & theOutputBuffer
  708.    
  709.     --convert to html clipboard format
  710.     return typeHTML(theOutputBuffer)
  711.    
  712. end typeText



(*
https://www.oreilly.com/library/view/applescript-the-definitive/0596102119/re89.html

https://stackoverflow.com/questions/11085654/apple-script-how-can-i-copy-html-content-to-the-clipboard

-- user has copied a file's icon in the Finder
clipboard info
-- {{string, 20}, {«class ut16», 44}, {«class hfs », 80}, {«class utf8», 20}, {Unicode text, 42}, {picture, 2616}, {«class icns», 43336}, {«class furl», 62}}

textutil -convert html foo.rtf

if ((clipboard info) as string) contains "«class furl»" then
    log "the clipboard contains a file named " & (the clipboard as string)
else
    log "the clipboard does not contain a file"
end if

the clipboard       required
as  class   optional

tell application "Script Editor"
    activate
end tell

textutil has a simplistic text to html conversion
    set clipboardDataQuoted to quoted form of theData
    log "quoted form is " & clipboardDataQuoted
    
    set toUnix to "/bin/echo -n " & clipboardDataQuoted
    set toUnix to toUnix & " | textutil -convert html -noload -nostore -stdin -stdout "
    log "toUnix is " & toUnix
    set fromUnix to do shell script toUnix
    log "fromUnix  is " & fromUnix
    
    
set s to "Today is my birthday"
log text 1 thru ((offset of "my" in s) - 1) of s
--> "Today is "
            -- text 1 thru ((offset of "my" in s) - 1) of s
            -- Subtract 1 because offset returns the position of the first matched character, "m".
            
log "beginningOfTag is " & text 1 thru (minimumPositiveNumber from {200, length of beginningOfTag}) of beginningOfTag & "<+++++++++++++++++++++++"

https://developer.apple.com/library/archive/documentation/AppleScript/Conceptual/AppleScriptLangGuide/reference/ASLR_cmds.html

*)

--mac $ hex=`echo -n "<p>your html code here</>" | hexdump -ve '1/1 "%.2x"'`
--mac $ echo $hex
--3c703e796f75722068746d6c20636f646520686572653c2f3e
--mac $ osascript -e "set the clipboard to «data HTML${hex}»"
--mac $
(*
A sub-routine for encoding ASCII characters.

encode_char("$")
--> returns: "%24"

based on:
https://www.macosxautomation.com/applescript/sbrt/sbrt-08.html

*)
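(*
The note above describes encode_char but the routine itself is not included here.
What follows is only a minimal sketch of how such a routine could look, not
necessarily the code on the cited page.  Assumptions: the argument is a single
character whose code is in the range 0-255, and "id of" is used to obtain the
character code.  The names hex_list, char_code, high_digit and low_digit are
made up for this illustration.

on encode_char(this_char)
    set hex_list to {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F"}
    -- character code, for example 36 for "$"
    set char_code to id of this_char
    -- list items are 1-based, so add 1 to each hex digit value
    set high_digit to item ((char_code div 16) + 1) of hex_list
    set low_digit to item ((char_code mod 16) + 1) of hex_list
    return "%" & high_digit & low_digit
end encode_char

encode_char("$")
--> "%24"   (36 div 16 is 2, 36 mod 16 is 4)

It is kept inside a comment, like the other reference snippets in this file, so
it does not change what the script does.
*)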
(*
Lowest Numeric Value in a List

This sub-routine will return the lowest numeric value in a list of items. The passed list can contain non-numeric data as well as lists within lists. For example:

lowest_number({-3.25, 23, 2345, "sid", 3, 67})
--> returns: -3.25
lowest_number({-3.25, 23, {-22, 78695, "bob"}, 2345, true, "sid", 3, 67})
--> returns: -22

If there is no numeric data in the passed list, the sub-routine will return a null string ("")

lowest_number({"this", "list", "contains", "only", "text"})
--> returns: ""

https://macosxautomation.com/applescript/sbrt/sbrt-03.html

Here's the sub-routine:

*)
(*
on lowestNumber(values_list)
    set the low_amount to ""
    repeat with i from 1 to the count of the values_list
        set this_item to item i of the values_list
        set the item_class to the class of this_item
        if the item_class is in {integer, real} then
            if the low_amount is "" then
                set the low_amount to this_item
            else if this_item is less than the low_amount then
                set the low_amount to item i of the values_list
            end if
        else if the item_class is list then
            -- recurse into the nested list; the call must use this handler's own name
            set the low_value to lowestNumber(this_item)
            if the low_value is less than the low_amount then ¬
                set the low_amount to the low_value
        end if
    end repeat
    return the low_amount
end lowestNumber

https://lists.apple.com/archives/applescript-users/2010/Sep/msg00139.html
set list_of_values to {10, 20, 30, 40, 50, 60, 2000, 9, 3000, 4}

set minimum to 9.9999999999E+12
set maximum to 0
repeat with ref_to_value in list_of_values
    set the_value to contents of ref_to_value
    if the_value > maximum then set maximum to the_value
    if the_value < minimum then set minimum to the_value
end repeat

{minimum, maximum}

may do the trick.

Yvan KOENIG (VALLAURIS, France) lundi 13 septembre 2010 22:32:41
*)
(* https://lists.apple.com/archives/applescript-users/2010/Sep/msg00139.html
set list_of_values to {10, 20, 30, 40, 50, 60, 2000, 9, 3000, 4}

set minimum to 9.9999999999E+12

assume it's limited to positive values


on maxValue(list_of_values)
    global debug
    if debug ≥ 3 then log "in maxValue " & return & list_of_values
    set maximum to 0
    repeat with ref_to_value in list_of_values
        set the_value to contents of ref_to_value
        if the_value > maximum then set maximum to the_value
    end repeat
    if debug ≥ 3 then log maximum
    return maximum
end maxValue
*)
-- ------------------------------------------------------
(*
http://harvey.nu/applescript_url_encode_routine.html

on urlencode(theText)
    set theTextEnc to ""
    repeat with eachChar in characters of theText
        set useChar to eachChar
        set eachCharNum to ASCII number of eachChar
        if eachCharNum = 32 then
            set useChar to "+"
        else if (eachCharNum ≠ 42) and (eachCharNum ≠ 95) and (eachCharNum < 45 or eachCharNum > 46) and (eachCharNum < 48 or eachCharNum > 57) and (eachCharNum < 65 or eachCharNum > 90) and (eachCharNum < 97 or eachCharNum > 122) then
            set firstDig to round (eachCharNum / 16) rounding down
            set secondDig to eachCharNum mod 16
            if firstDig > 9 then
                set aNum to firstDig + 55
                set firstDig to ASCII character aNum
            end if
            if secondDig > 9 then
                set aNum to secondDig + 55
                set secondDig to ASCII character aNum
            end if
            set numHex to ("%" & (firstDig as string) & (secondDig as string)) as string
            set useChar to numHex
        end if
        set theTextEnc to theTextEnc & useChar as string
    end repeat
    return theTextEnc
end urlencode

Clipboard classes after a copy from the application.
from waterfox
(*«class HTML», 13876, «class utf8», 505, «class ut16», 1012, string, 505, Unicode text, 1010*)

from chrome
(*«class HTML», 748, «class utf8», 204, «class ut16», 410, string, 204, Unicode text, 408*)

from safari
(*«class weba», 120785, «class RTF », 70255, «class HTML», 122811, «class utf8», 3370, «class ut16», 6772, uniform styles, 47132, string, 3385, scrap styles, 8122, Unicode text, 6732, uniform styles, 47132, scrap styles, 8122*)

iCab
(*«class weba», 1665, «class RTF », 763, «class utf8», 121, «class ut16», 244, uniform styles, 376, string, 121, scrap styles, 62, Unicode text, 242, uniform styles, 376, scrap styles, 62*)

Opera
(*«class HTML», 5767, «class utf8», 150, «class ut16», 302, string, 150, Unicode text, 300*)

Textedit
(*«class RTF », 1136, «class utf8», 138, «class ut16», 278, uniform styles, 148, string, 138, scrap styles, 22, Unicode text, 276, uniform styles, 148, scrap styles, 22*)

Word
(*«class DSIG», 4, «class DOBJ», 56, «class OBJD», 244, «class RTF », 30573, «class HTML», 21160, scrap styles, 22, uniform styles, 136, string, 210, Unicode text, 420, «class PDF », 13197, picture, 154058, «class EMBS», 33280, «class LNKS», 909, «class LKSD», 244, «class OJLK», 93, «class HLNK», 1387, «class OFSC», 232, «class ut16», 422, «class DSIG», 4, «class DOBJ», 56, «class OBJD», 244, scrap styles, 22, uniform styles, 136, «class EMBS», 33280, «class LNKS», 909, «class LKSD», 244, «class OJLK», 93, «class HLNK», 1387, «class OFSC», 232*)

TextWrangler
(*«class utf8», 185, «class BBLM», 4, «class ut16», 372, string, 185, Unicode text, 370, «class BBLM», 4*)

*)