- (*
- This AppleScript converts clipboard input into a format suited for pasting into an ASC
- reply. I observed that my copies into an ASC reply were not formatted well, while
- copies from a web browser were formatted much better. This script therefore adjusts
- the clipboard copy to the format a web browser produces, for best results.
- This AppleScript accepts the clipboard in either of two forms:
- -- plain text, which is converted to HTML. Conversion is limited to inserting paragraph tags for blank lines and inserting links where http or https text appears. The page title is substituted for the link text.
- -- HTML source code, identified by text containing HTML markup.
- Caveat emptor.
- To use:
- 1) Copy (command + C) the data you want to convert.
- 2) Run this AppleScript by double-clicking on the app.
- 3) Paste (command + V) into an ASC reply.
- I have tested in Waterfox 56.2.9 in Yosemite. I assume the process will work with other web browsers and other versions of macOS.
- Save as an Application Bundle. Don't check any of the boxes.
- Should you experience a problem, run in the Script Editor.
- Shows how to debug via on run path. Shows items added to folder. Shows log statement.
- It is easier to diagnose problems with debug information. I suggest adding log statements to your script to see what is going on. Here is an example.
- For testing, run in the Script Editor.
- 1) Click on the Event Log tab to see the output from the log statement
- 2) Click on Run
- change log
- may 1, 2019 -- skip a "403 Forbidden" title
- may 2, 2019 -- convert \" to ". the \" mysteriously appears in HTML source code input. Probably some TextEdit artifact.
- copy to TextEdit copy out of TextEdit.
- may 3, 2019 -- regressed may 2nd update. Applescript was inserting \ into output.
- may 8, 2019 -- special processing for html class on clipboard
- enhancements:
- -- get pdf title
- Author: rccharles
- Copyright 2019 rccharles
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- SOFTWARE.
- example text document: remember to edit out the returns.
- set the clipboard to «data HTML3C68746D6C3E3C686561643E3C6D65746120687474702D65717569763D22636
- F6E74656E742D747970652220636F6E74656
- E743D22746578742F68746D6C3B206368617273657
- 43D7574662D38223E3C2F686561643E3C626F64793E3C62723E0A202020203C62207374
- 796C653D22636F6C6F723A677265656E3B2
- 23E506172616C6C656C733C2F623E3A3C62723E0A2
- 0202020467265652076657273696F6E206F6620506172616C6C656C7320666F7220696E6
- 46976696475616C207573653A3C62723E0A
- 68747470733A2F2F6974756E65732E6170706C652E
- 636F6D2F75732F6170702F706172616C6C656C732D6465736B746F702D6C6974652F696
- 4313038353131343730393F6D743D31323C
- 62723E0A2020202046756C6C2076657273696F6E3C6
- 2723E0A202020203C6120687265663D22687474703A2F2F7777772E706172616C6C656C
- 732E636F6D2F656E2F70726F64756374732F
- 6465736B746F702F223E687474703A2F2F7777772E7
- 06172616C6C656C732E636F6D2F656E2F70726F64756374732F6465736B746F702F3C2F6
- 13E3C62723E0A202020203C62723E0A2020
- 20203C623E564D7761726520467573696F6E3C2F62
- 3E3C62723E0A202020205769746820564D7761726520467573696F6E2C2072756E20746
- 865206D6F73742064656D616E64696E6720
- 4D616320616E642057696E646F77730A20202020617
- 0706C69636174696F6E7320730A6964652D62792D73696465206174206D6178696D756
- D2073706565647320776974686F7574207265626F6F74696E673C62723E0A20202020687
- 474703A
- 2F2F7777772E766D776172652E636F6D2F70726F64756374732F667573696F6E2F3C2F62
- 6F64793E3C2F68746D6C3E»
- Translated text is:
- Full version<br>
- <a href="http://www.parallels.com/en/products/desktop/">http://www.parallels.com/en/products/desktop/</a><br>
- <br>
- <b>VMware Fusion</b><br>
- set the clipboard to «data HTML2020202046756C6C2076657273696F6E3C62723E0A202020203C612068726566
- 3D22687474703A2F2F7777772E706172616
- C6C656C732E636F6D2F656E2F70726F64756374732F6465736B746F702F223E687474703
- A2F2F7777772E706172616C6C656C732E63
- 6F6D2F656E2F70726F64756374732F6465736B746F702F3C2F613E3C62723E0A20202020
- 3C62723E0A202020203C623E564D7761726
- 520467573696F6E3C2F623E3C62723E0A»
- set the clipboard to "Saturday, September 7, 2019
- Live streamed
- https://www.omf.ngo/community-symposium-2/"
- set the clipboard to "\"Effective defenses 111 threats\" by John Galt
- https://discussions.apple.com/docs/DOC-8841
- \"Avoid phishing emails 222 and other scams\""
- https://support.apple.com/en-ca/HT204759
- blank lines
- also,see:http://www.google.com/ seeing again:http://www.google.com"
- *)
- (* For whatever reason, this segment doesn't work when moved above.
- set the clipboard to "<p>Simple put, Apple attempts to provide all the malware detection and removal you need in Mac OS X.</p>
- <p>\"Effective defenses against malware and other threats\" by John Galt
- <a href=\"https://discussions.apple.com/docs/DOC-8841\" target=\"_blank\">Effective
- defenses against malware and ot… - Apple Community</a>
- </p><p> </p>"
- *)
- (*
- set the clipboard to "<p>Simple put, Apple attempts to provide all the malware detection and removal you need in Mac OS X.</p>
- <p>\"Effective defenses against malware and other threats\" by John Galt
- <a href=\"https://discussions.apple.com/docs/DOC-8841\" target=\"_blank\">Effective
- defenses against malware and to… - Apple Community</a>
- </p><p> </p>"
- *)
- (*
- set the clipboard to "Saturday, September 7, 2019
- Live streamed
- https://www.omf.ngo/community-symposium-2/"
- *)
- -- Gets invoked here when you run in AppleScript editor or double click on the app icon.
- on run
- global debug
- set debug to 2
- set theList to clipboard info
- printClipboardInfo(theList)
- set cbInfo to get (clipboard info) as string
- -- Most likely, if we have HTML data in the clipboard it will be from a web browser or Word.
- if cbInfo contains "HTML" then
- log "Working with HTML Class data from clipboard."
- set theBoard to the clipboard as «class HTML»
- --log "Print out inputted HTML data on the clipboard..." -- it's just going to be a hex string. waste.
- --log theBoard
- set normalHtml to do shell script "osascript -e 'try' -e 'get the clipboard as «class HTML»' -e 'end try' | awk '{sub(/«data HTML/, \"\") sub(/»/, \"\")} {print}' | xxd -r -p "
- log "...Print out plain text version of the input HTML data from the clipboard..." & return & normalHtml
- log "printed in hex"
- hexDumpFormat("normalHtml", normalHtml)
- set returnedData to adjustBrowserHTML(normalHtml)
- log "...Print out plain text version of adjusted HTML data ..." & return & returnedData
- log "...just printed plain text version"
- log "printed in hex"
- hexDumpFormat("returnedData", returnedData)
- set returnedData to convertToHTML(returnedData)
- try
- log "returnedData is " & returnedData
- on error errStr number errorNumber
- log "===> We didn't find HTML data. errStr is " & errStr & " errorNumber is " & errorNumber
- return
- end try
- else
- -- will work with plain HTML or plain text.
- try
- log "Working with plain html or plain text"
- set clipboardData to (the clipboard as text)
- if debug ≥ 2 then
- log "class clipboardData is " & class of clipboardData
- log "calling printHeader."
- end if
- log "continuing plain html or plain text"
- printHeader("clipboardData", clipboardData)
- on error errStr number errorNumber
- log "===> We didn't find data on the clipboard. errStr is " & errStr & " errorNumber is " & errorNumber
- display dialog "We found neither HTML source code nor plain text on the clipboard." & return & "Please copy from a different source." giving up after 15
- return 1
- end try
- log "calling common"
- set returnedData to common(clipboardData)
- end if
- log "place on the clipboard returnedData is " & returnedData
- postToCLipboard(returnedData)
- -- return code
- return 0
- end run
- -- Folder actions.
- -- Gets invoked here when something is dropped on the folder that this script is monitoring.
- -- Right click on the folder to be monitored. Services > Folder Actions Setup...
- on adding folder items to this_folder after receiving added_items
- -- TBD
- end adding folder items to
- -- Gets invoked here when something is dropped on this AppleScript icon
- on open dropped_items
- global debug
- set debug to 2
- (*
- -- Debug code.
- set fileName to choose file with prompt "get file"
- set dropped_items to {fileName}
- *)
- log "class of dropped_items is " & class of dropped_items
- display dialog "You dropped " & (count of dropped_items) & " item or items." & return & " Caveat emptor. You have been warned." giving up after 6
- set totalFileData to ""
- repeat with droppedItem in dropped_items
- log "The droppedItem is "
- -- display dialog "processing file " & (droppedItem as string) giving up after 3
- log droppedItem
- log "class = " & class of droppedItem
- set extIs to findExtension(droppedItem)
- set extIsU to makeCaseUpper(extIs)
- if extIsU is "HTML" or extIsU is "HTM" or extIsU is "TEXT" or extIsU is "TXT" then
- try
- set theFile to droppedItem as string
- set theFile to open for access file theFile
- set allOfFile to read theFile
- close access theFile
- printHeader("read from file ( allOfFile )", allOfFile)
- set totalFileData to totalFileData & common(allOfFile)
- on error theErrorMessage number theErrorNumber
- log theErrorMessage & " error number " & theErrorNumber
- -- The file may never have been opened, so guard the close.
- try
- close access theFile
- end try
- end try
- else
- -- we do not support this extension
- display dialog "We only support files with an extension of html, htm, text or txt, in either case. Your file had a " & extIs & " extension. Skipping." giving up after 10
- end if
- end repeat
- postToCLipboard(totalFileData)
- -- return code
- return 0
- end open
- -- ------------------------------------------------------
- on common(clipboardData)
- global debug
- set lf to character id 10
- -- Write a message into the event log.
- log " --- Starting on " & ((current date) as string) & " --- "
- set cbInfo to get (clipboard info) as string
- -- don't let Windoze confuse us. convert Return LineFeed to lf
- set clipboardData to alterString(clipboardData, return & lf, lf)
- -- might as well convert classic macOS return to lf. We will have to look for fewer things.
- set clipboardData to alterString(clipboardData, return, lf)
- -- figure out what type of data we have: plain text or HTML source code text.
- set paraCount to count of textToList(clipboardData, "<p")
- set endparaCount to count of textToList(clipboardData, "</p>")
- set titleCount to count of textToList(clipboardData, "<title")
- set endTitleCount to count of textToList(clipboardData, "</title>")
- set aLinkCount to count of textToList(clipboardData, "href=\"http")
- -- mangled href="http
- set mangledLinkCount to count of textToList(clipboardData, "href=\\\"http")
- set brCount to count of textToList(clipboardData, "<br>")
- if debug ≥ 1 then
- log "Values used to distinguish HTML source code from plain text."
- log "paraCount is " & paraCount
- log "endparaCount is " & endparaCount
- log "titleCount is " & titleCount
- log "endTitleCount is " & endTitleCount
- log "aLinkCount is " & aLinkCount
- log "brCount is " & brCount
- log "mangledLinkCount is " & mangledLinkCount
- end if
- --set endHttpCount to count of textToList(clipboardData, "http://")
- --set endHttpsCount to count of textToList(clipboardData, "https://")
- -- note, textToList returns a count one greater than the actual because item one is the data before the first found entry.
- if (paraCount ≥ 4 and endparaCount ≥ 3) or brCount ≥ 4 or ((titleCount is endTitleCount) and titleCount ≥ 2) or aLinkCount ≥ 3 or mangledLinkCount ≥ 3 then
- log "... found HTML input ... (in plain text format )."
- set clipboardData to adjustURLs(clipboardData)
- set clipboardData to adjustAscHTML(clipboardData)
- set readyData to convertToHTML(clipboardData)
- else
- log "... found plain text input ..."
- set readyData to typeText(clipboardData)
- set readyData to convertToHTML(readyData)
- end if
- return readyData
- end common
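- (*
- Illustration only (not part of the original script): why the counts in common() are one
- greater than the number of tags found. This sketch assumes that textToList, which is not
- shown in this section, splits on text item delimiters as below. The sample string is made up.
- set saveTID to AppleScript's text item delimiters
- set AppleScript's text item delimiters to "<p"
- set pieces to text items of "<p>a</p><p>b</p>"
- set AppleScript's text item delimiters to saveTID
- log (count of pieces) -- 3: the empty text before the first "<p", then one piece per tag
- *)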
- -- ------------------------------------------------------
- (* add paragraphs *)
- on addParagraphs(theOutputBuffer)
- global debug
- set lf to character id 10
- -- start the theOutputBuffer with a paragraph tag. We are taking a simple approach at this time.
- set theOutputBuffer to "<p>" & theOutputBuffer
- -- LF
- -- Remember CRLF was changed to LF above and CR was changed to LF above.
- -- we don't want no Windoze problems
- set theOutputBuffer to alterString(theOutputBuffer, lf & lf, "</p><p> </p><p>")
- -- Does the string end with a dangling paragraph?
- if debug ≥ 3 then
- log "length of theOutputBuffer is " & length of theOutputBuffer
- log "((length of theOutputBuffer) - 2) is " & ((length of theOutputBuffer) - 2)
- log "(length of theOutputBuffer) is " & (length of theOutputBuffer)
- log "((length of theOutputBuffer) - 3) is " & ((length of theOutputBuffer) - 3)
- end if
- -- "ends with" avoids comparing a three-character slice against the four-character "</p>".
- if theOutputBuffer ends with "<p>" then
- set theOutputBuffer to text 1 thru ((length of theOutputBuffer) - 3) of theOutputBuffer
- else if not (theOutputBuffer ends with "</p>") then
- set theOutputBuffer to theOutputBuffer & "</p>"
- end if
- return theOutputBuffer
- end addParagraphs
- -- ------------------------------------------------------
- (*
- We received HTML class data on the clipboard. This is the manager.
- *)
- on adjustBrowserHTML(normalHtml)
- set lf to character id 10
- -- don't let Windoze confuse us. convert Return LineFeed to lf
- set normalHtml to alterString(normalHtml, return & lf, lf)
- -- might as well convert classic macOS return to lf. We will have to look for fewer things.
- set normalHtml to alterString(normalHtml, return, lf)
- hexDumpFormat("normalHtml", normalHtml)
- set alteredHTML to adjustURLs(normalHtml)
- set alteredHTML to adjustAscHTML(alteredHTML)
- return alteredHTML
- end adjustBrowserHTML
- -- ------------------------------------------------------
- (* ASC likes to insert lots of white space into a page.
- This routine attempts to fix up the html to avoid
- all the extra white-space.
- Minimize the amount of white space inserted.
- *)
- on adjustAscHTML(AscHtml)
- -- surprisingly ASC converts <p> </p> to <p><br></p>, that is a
- -- space only paragraph to a paragraph with a <br> in it.
- -- get rid of the space to avoid this conversion.
- set AscHtml to alterString(AscHtml, "<p> </p>", "<p></p>")
- return AscHtml
- end adjustAscHTML
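- (*
- Usage sketch for adjustAscHTML, added for illustration. The sample markup is made up.
- set cleaned to adjustAscHTML("<p>one</p><p> </p><p>two</p>")
- log cleaned -- <p>one</p><p></p><p>two</p>
- *)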
- -- ------------------------------------------------------
- (*
- example:
- Free version of Parallels for individual use:</p><p><br></p>
- <p>https://itunes.apple.com/us/app/parallels-desktop-lite/id1085114709?mt=12</p>
- <p><br></p>
- <p>Full version</p><p><a href="http://www.parallels.com/en/products/desktop/" target="_blank">
- http://www.parallels.com/en/products/desktop/</a>
- If ASC finds a URL outside of an a tag, it will place blank lines around the URL. No, it will not go the
- full nine yards and place an a tag around the URL.
- *)
- on adjustURLs(theOriginalInputBuffer)
- global debug
- set alteredBuffer to false
- set lf to character id 10
- set theInputBuffer to theOriginalInputBuffer
- hexDumpFormat("theInputBuffer", theInputBuffer)
- -- we end up in a lot of grief when the buffer ends without
- -- a line-end
- if text (length of theInputBuffer) thru (length of theInputBuffer) of theInputBuffer is not lf then
- set alteredBuffer to true
- set theInputBuffer to theInputBuffer & lf
- hexDumpFormat("theInputBuffer", theInputBuffer)
- end if
- set buildHTML to ""
- if debug ≥ 3 then log "buildHTML [ should be empty string ] is " & buildHTML
- set countI to 1 -- variable is used for debugging.
- -- do until we have processed theInputBuffer
- repeat until theInputBuffer is ""
- log "at the top of theInputBuffer ........."
- set foundWhere to {}
- repeat with lookCharacters in {"https://", "http://", "<a "}
- copy (offset of lookCharacters in theInputBuffer) to the end of the foundWhere
- try
- set tempLoc to (offset of lookCharacters in theInputBuffer)
- log "searching for " & lookCharacters & " found at offset " & tempLoc & " contains " & text tempLoc thru (tempLoc + ((length of lookCharacters) - 1)) of theInputBuffer
- end try
- end repeat
- log foundWhere
- set foundMarkerOffset to (minimumPositiveNumber from foundWhere)
- -- figure out what type of marker we got?
- -- None. Reached the end of the data without finding one.
- if foundMarkerOffset ≤ 0 then
- -- we are done
- log "Found all links."
- set buildHTML to buildHTML & theInputBuffer
- printHeader("buildHTML", buildHTML)
- set theInputBuffer to ""
- exit repeat -- ------ done processing theInputBuffer ------>
- end if
- -- find which of three markers we found.
- if (text foundMarkerOffset thru (foundMarkerOffset + 2) of theInputBuffer) is "<a " then
- set actualMarker to "<a "
- else if text foundMarkerOffset thru (foundMarkerOffset + 6) of theInputBuffer is "http://" then
- set actualMarker to "http://"
- else
- -- just assume it's the remaining "https://" since we looked for just three.
- set actualMarker to "https://"
- end if
- set actualMarkerOffsetLength to ((length of actualMarker) - 1)
- log "actualMarker is " & actualMarker & " actualMarkerOffsetLength is " & actualMarkerOffsetLength
- log "foundMarkerOffset is " & foundMarkerOffset & " verify marker text is " & text foundMarkerOffset thru (foundMarkerOffset + actualMarkerOffsetLength) of theInputBuffer
- if foundMarkerOffset ≥ 2 then
- -- collect and strip off characters that are before the marker.
- log "buildHTML is " & buildHTML & " length is " & length of buildHTML
- hexDumpFormat("theInputBuffer", theInputBuffer)
- log " (foundMarkerOffset - 1) is " & (foundMarkerOffset - 1)
- -- get the preceding text
- set buildHTML to buildHTML & text 1 thru (foundMarkerOffset - 1) of theInputBuffer
- log "buildHTML is " & buildHTML
- --printHeader("buildHTML", buildHTML)
- hexDumpFormat("buildHTML", buildHTML)
- -- https://apple.stackexchange.com/a/20135/44531
- set theInputBuffer to text foundMarkerOffset thru -1 of theInputBuffer --trim off character before what we found
- printHeader("theInputBuffer", theInputBuffer)
- hexDumpFormat("theInputBuffer", theInputBuffer)
- else
- log "no preceding data."
- end if
- repeat 1 times -- single-pass loop, so that "exit repeat" skips to the next pass of the outer loop
- -- example: the URL is also the display text
- -- <a href="https://discussions.apple.com/docs/DOC-8841" target="_blank">https://discussions.apple.com/docs/DOC-8841</a>
- hexDumpFormat("theInputBuffer", theInputBuffer)
- -- check for the <a> tag
- if text 1 thru (length of "<a ") of theInputBuffer is "<a " then
- -- found <a> tag
- log "processing <a> tag"
- -- ASC considers a line-end a <br>, whereas Firefox considers it a blank.
- -- change a possible line-end before an <a> tag to a " "
- if debug ≥ 1 then hexDumpFormat("before lf check buildHTML", buildHTML)
- -- "ends with" is safe even when buildHTML is still empty.
- if buildHTML ends with lf then
- log "we need to delete a line-end before the <a> tag"
- set buildHTML to text 1 thru ((length of buildHTML) - 1) of buildHTML
- set buildHTML to buildHTML & " "
- if debug ≥ 1 then hexDumpFormat("after lf deletion buildHTML", buildHTML)
- end if
- -- find ending </a> tag
- set whereEnds to offset of "</a>" in theInputBuffer
- if whereEnds ≤ 0 then
- log "==> found an error in the HTML. no ending </a>"
- set buildHTML to buildHTML & theInputBuffer
- printHeader("buildHTML", buildHTML)
- set theInputBuffer to ""
- exit repeat -- ------ next ------>
- end if
- set lastOffsetLength to ((length of "</a>") - 1)
- log "lastOffsetLength is " & lastOffsetLength
- set lastCharacterOffset to whereEnds + lastOffsetLength
- log "lastCharacterOffset is " & lastCharacterOffset
- -- needs to copy the ending ">"
- set anchorString to text 1 thru lastCharacterOffset of theInputBuffer
- -- don't let Windoze confuse us. convert Return LineFeed to lf
- -- Correct absurd ASC bug where there is a line-end in the <a> text.
- hexDumpFormat("before adjusting anchorString", anchorString)
- set anchorString to alterString(anchorString, lf, " ")
- hexDumpFormat("anchorString", anchorString)
- set buildHTML to buildHTML & anchorString
- hexDumpFormat("buildHTML", buildHTML)
- -- https://apple.stackexchange.com/a/20135/44531
- -- We want first character of the "next" portion of theInputBuffer so add one
- set theInputBuffer to text (lastCharacterOffset + 1) thru -1 of theInputBuffer --trim out <a>
- hexDumpFormat("theInputBuffer", theInputBuffer)
- -- Web Browsers like Firefox convert a line-end in text to a space.
- if text 1 thru 1 of theInputBuffer is lf then
- if (length of theInputBuffer) is 1 then
- set theInputBuffer to " "
- else
- set theInputBuffer to " " & (text 2 thru (length of theInputBuffer) of theInputBuffer)
- if debug ≥ 1 then hexDumpFormat("after lf deletion; theInputBuffer", theInputBuffer)
- end if
- end if
- exit repeat -- ------ next ------>
- end if
- -- find the end of the HTML URL by splitting on blank or return
- -- unsafe characters <blank> " < > # % { } | \ ^ ~ [ ] `
- -- and line-end
- -- https://perishablepress.com/stop-using-unsafe-characters-in-urls/
- -- the clipboard string may end right after the url, hence no " ", LF or CR
- -- Remember, CRLF was converted to LF above
- set endsWhere to {}
- -- the end of the url ends with one of the not allowed characters + line-end
- repeat with unsafeCharacter in {" ", "\"", lf, "<", ">", "#", "%", "{", "}", "|", "\\", "^", "~", "[", "]"}
- copy (offset of unsafeCharacter in theInputBuffer) to the end of the endsWhere
- end repeat
- log endsWhere
- set endOfURL to (minimumPositiveNumber from endsWhere) - 1
- log "endOfURL is " & endOfURL
- if endOfURL ≤ 0 then
- -- We have reached the end of the input
- set theURL to theInputBuffer
- set theInputBuffer to ""
- else
- set theURL to text 1 thru endOfURL of theInputBuffer
- log "from middle theURL is " & theURL
- set theInputBuffer to text (endOfURL + 1) thru -1 of theInputBuffer -- trim off url in front.
- end if
- printHeader("printHeader", theInputBuffer)
- log "----------------------- " & theURL & " -----------------------"
- (*
- retrieve the file pointed to by the URL so we can
- get the title. Note: <title> can have attributes. Example:
- <title data-test-page-title="Parallels Desktop Lite on the Mac App Store"
- >Parallels Desktop Lite on the Mac App Store</title>
- *)
- -- Example:
- -- curl --silent --location --max-time 10 <URL>
- set toUnix to "curl --silent --location --max-time 10 " & quoted form of theURL
- log "what we will use to retrieve the Url. toUnix is " & return & " " & toUnix
- try
- log "reading link file to get title"
- set fromUnix to do shell script toUnix
- --log "fromUnix:"
- printHeader("fromUnix", fromUnix)
- -- may not be working with an HTML document, so the found title may be too long or confused.
- log "how far?..."
- -- there could be some baggage with the <title
- set actualTagData to tagContent(fromUnix, "<title", "</title>")
- -- Find what we will actually display in the title.
- -- Fix up gotchas.
- -- printHeader does its own logging and returns no value to concatenate.
- printHeader("actualTagData", actualTagData)
- if actualTagData is "" then
- set actualTagData to theURL
- else if length of actualTagData > 140 then
- log "length of actualTagData is " & length of actualTagData & " which is too long. Using the URL instead."
- set actualTagData to theURL
- -- curl https://appleid.apple.com returns <title>403 Forbidden</title>
- -- which is misleading.
- else if actualTagData contains "403" and actualTagData contains "Forbidden" then
- set actualTagData to theURL
- else
- -- there could be some attributes within the <title> tag.
- -- or there could not be
- -- an attribute could have a > in it. ignoring that for now.
- try
- -- find where <title ends
- set whereToEnd to (offset of ">" in actualTagData)
- log "whereToEnd is " & whereToEnd
- set whereToBegin to whereToEnd + (length of ">")
- log "whereToBegin is " & whereToBegin
- hexDumpFormat("actualTagData", actualTagData)
- set actualTagData to text whereToBegin thru (length of actualTagData) of actualTagData
- log "actualTagData is " & actualTagData
- on error theErrorMessage number theErrorNumber
- log "==> No ending greater than (>) for title. Badly constructed html." & return & "message is " & theErrorMessage & " error number " & theErrorNumber
- set actualTagData to actualTagData
- -- no need to repair. It's not our page.
- end try
- -- found line-end in title. caused confusion.
- -- note: this is new data and the multiple line-ends have not been
- -- filtered out.
- -- some joker had a line-end in the title!
- set actualTagData to alterString(actualTagData, return & lf, " ")
- set actualTagData to alterString(actualTagData, return, " ")
- set actualTagData to alterString(actualTagData, lf, " ")
- log "actualTagData has been changed which is " & actualTagData
- hexDumpFormat("actualTagData", actualTagData)
- end if
- on error errMsg number n
- log "==> Error occurred when looking for title. " & errMsg & " with number " & n
- set actualTagData to theURL
- end try
- -- why the _blank in the <a>?
- set assembled to "<a href=\"" & theURL & "\" target=\"_blank\">" & actualTagData & "</a>"
- log "assembled is " & assembled
- if (length of theInputBuffer) ≤ 0 then
- -- We have reached the end of the input
- log "we have reached the end of the input."
- set buildHTML to buildHTML & assembled
- else
- log "more input to process"
- set buildHTML to buildHTML & assembled
- end if
- -- wrap up
- --log "transformed text from buildHTML is " & return & buildHTML
- log "#" & countI & " transformed text from buildHTML is " & return & buildHTML
- -- number of links found
- set countI to countI + 1
- end repeat -- single-pass loop; see "repeat 1 times" above
- end repeat -- processing links in the input text
- if alteredBuffer is true then
- -- chop off the lf we added above.
- set buildHTML to text 1 thru ((length of buildHTML) - 1) of buildHTML
- set alteredBuffer to false -- somewhat redundant
- end if
- return the buildHTML
- end adjustURLs
- -- ------------------------------------------------------
- (*
- alterString
- thisText is the input string to change
- delim is what string to change. It doesn't have to be a single character.
- replacement is the new string
- returns the changed string.
- *)
- on alterString(thisText, delim, replacement)
- set resultList to {}
- -- Fall back to the unchanged input if the split or join fails.
- set resultString to thisText
- set {tid, my text item delimiters} to {my text item delimiters, delim}
- try
- set resultList to every text item of thisText
- set text item delimiters to replacement
- set resultString to resultList as string
- set my text item delimiters to tid
- on error
- set my text item delimiters to tid
- end try
- return resultString
- end alterString
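- (*
- Usage sketch for alterString, added for illustration. The sample strings are made up.
- set underscored to alterString("one two three", " ", "_")
- log underscored -- one_two_three
- -- delim may be longer than one character:
- log alterString("a<br>b", "<br>", " ") -- a b
- *)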
- -- ------------------------------------------------------
- (*
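- Split theString at the first occurrence of theToken.
- Returns {text from theToken to the end, text before theToken}.
- If theToken is not found, returns {theString, ""}.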
- *)
- on answerAndChomp(theString, theToken)
- set debugging to false
- set theOffset to offset of theToken in theString
- if debugging then log "theOffset is " & theOffset
- set theLength to length of theString
- if theOffset > 0 then
- set beginningPart to text 1 thru (theOffset - 1) of theString
- if debugging then log "beginningPart is " & beginningPart
- set chompped to text theOffset thru theLength of theString
- if debugging then log "chompped is " & chompped
- return {chompped, beginningPart}
- else
- set beginningPart to ""
- return {theString, beginningPart}
- end if
- end answerAndChomp
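- (*
- Usage sketch for answerAndChomp, added for illustration. The sample string is made up.
- Note that the first item of the result still starts with theToken.
- set {fromToken, beforeToken} to answerAndChomp("key=value", "=")
- log fromToken -- =value
- log beforeToken -- key
- *)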
- -- ------------------------------------------------------
- (*
- Delete the leading part of the string until and including theToken.
- *)
- on chompLeftAndTag(theString, theToken)
- set debugging to false
- --log text 1 thru ((offset of "my" in s) - 1) of s
- --set rightString to offset of theToken in theString thru count of theString of theString
- set theOffset to offset of theToken in theString
- if debugging then log "theOffset is " & theOffset
- set theLength to length of theString
- if debugging then log "theLength is " & theLength
- if theOffset > 0 then
- set chompped to text (theOffset + (length of theToken)) thru theLength of theString
- if debugging then log "chompped is " & chompped
- return chompped
- else
- return ""
- end if
- end chompLeftAndTag
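- (*
- Usage sketch for chompLeftAndTag, added for illustration. The sample strings are made up.
- log chompLeftAndTag("<title>Hello", "<title>") -- Hello
- log chompLeftAndTag("no token here", "<title>") -- empty string, since theToken was not found
- *)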
- -- ------------------------------------------------------
- (*
- Yvan Koenig
- https://macscripter.net/viewtopic.php?id=43133
- *)
- on findExtension(inputFileName)
- set fileName to inputFileName as string
- set saveTID to AppleScript's text item delimiters
- set AppleScript's text item delimiters to {"."}
- set theExt to last text item of fileName
- set AppleScript's text item delimiters to saveTID
- --log "theExt is " & theExt
- if theExt ends with ":" then set theExt to text 1 thru -2 of theExt
- --log "theExt is " & theExt
- return theExt
- end findExtension
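- (*
- Usage sketch for findExtension, added for illustration. The paths are hypothetical.
- log findExtension("Macintosh HD:Users:me:notes.txt") -- txt
- -- A trailing ":" (as on a folder or bundle path) is dropped from the result:
- log findExtension("Macintosh HD:Users:me:Sample.app:") -- app
- *)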
- -- ------------------------------------------------------
- (*
- http://krypted.com/mac-os-x/to-hex-and-back/
- 0000000: 3c68 746d 6c3e 3c68 6561 643e 3c6d 6574 <html><head><met
- " 0 2 4 6 8 a c e 0 2 4 6 8 a c e"
- *)
- on hexDumpFormat(textMessage, hex)
- global debug
- if debug ≥ 3 then log "in hexDumpFormat"
- if debug ≥ 3 then log "hex string is " & hex
- -- -r -p
- set toUnix to "/bin/echo -n " & (quoted form of hex) & " | xxd "
- if debug ≥ 3 then log "toUnix is " & toUnix
- try
- set fromUnix to do shell script toUnix
- log "variable " & textMessage & " in hex is " & return & " 0 2 4 6 8 a c e 0 2 4 6 8 a c e" & return & fromUnix
- on error errMsg number n
- log "==> convert hex string to string failed. " & errMsg & " with number " & n
- end try
- end hexDumpFormat
- -- ------------------------------------------------------
- (*
- https://stackoverflow.com/questions/55838252/minimum-value-that-not-zero
- set m to get minimumPositiveNumber from {10, 2, 0, 2, 4}
- log "m is " & m
- set m to minimumPositiveNumber from {0, 0, 0}
- log "m is " & m
- *)
- on minimumPositiveNumber from L
- local L
- if L = {} then return null
- set |ξ| to 0
- repeat with x in L
- set x to x's contents
- if (x < |ξ| and x ≠ 0) ¬
- or |ξ| = 0 then ¬
- set |ξ| to x
- end repeat
- |ξ|
- end minimumPositiveNumber
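- (*
- Sketch of how adjustURLs uses this handler, added for illustration. The sample text is made up.
- offset returns 0 for a marker that is absent, so the smallest non-zero offset is the first marker present.
- set sampleText to "see http://example.com now"
- set foundWhere to {}
- repeat with marker in {"https://", "http://", "<a "}
- copy (offset of marker in sampleText) to the end of foundWhere
- end repeat
- log foundWhere -- 0, 5, 0
- log (minimumPositiveNumber from foundWhere) -- 5
- *)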
- -- ------------------------------------------------------
- (*
- makeCaseUpper("Now is the time, perhaps, for all good men")
- *)
- on makeCaseUpper(theString)
- set UC to "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
- set LC to "abcdefghijklmnopqrstuvwxyz"
- set C to characters of theString
- repeat with ch in C
- if ch is in LC then set contents of ch to item (offset of ch in LC) of UC
- end repeat
- return C as string
- end makeCaseUpper
- -- ------------------------------------------------------
- on postToCLipboard(pleasePost)
- try
- -- osascript -e "set the clipboard to «data HTML${hex}»"
- set toUnixSet to "osascript -e \"set the clipboard to «data HTML" & pleasePost & "»\""
- -- printHeader does its own logging and returns no value to concatenate.
- printHeader("toUnixSet", toUnixSet)
- set fromUnixSet to do shell script toUnixSet
- log "fromUnixSet is " & fromUnixSet
- on error errMsg number n
- log "==> We tried to send back HTML data, but failed. " & errMsg & " with number " & n
- end try
- -- see what ended up on the clipboard
- set theList2 to clipboard info
- printClipboardInfo(theList2)
- end postToCLipboard
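- (*
- Example call for postToCLipboard above. A sketch only: the hex string is the text <p>Hi</p> encoded two hex digits per byte, which is the form convertToHTML returns.
- postToCLipboard("3c703e48693c2f703e")
- If the osascript call works, the clipboard info logged afterwards should list «class HTML».
- *)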
- -- ------------------------------------------------------
- on printClipboardInfo(theList)
- log (clipboard info)
- log class of theList
- log "Data types on the clipboard ... "
- printList("", theList)
- log "... "
- end printClipboardInfo
- -- ------------------------------------------------------
- (* Pump out the beginning of theString *)
- on printHeader(theName, theString)
- global debug
- if debug ≥ 3 then
- log "in printHeader"
- log theString
- log length of theString
- end if
- if length of theString ≤ 0 then
- log "==> no string to print"
- else
- log theName & " is " & text 1 thru (minimumPositiveNumber from {400, length of theString}) of theString & "<+++++++++"
- end if
- end printHeader
- -- ------------------------------------------------------
- (*
- print out the items in a list
- *)
- on printList(theName, splits)
- try
- set theCount to 1
- repeat with theEntry in splits
- --log "class of theEntry is " & class of theEntry
- set classDisplay to class of theEntry as text
- --log "classDisplay is " & classDisplay as text
- --log "class of classDisplay is " & class of classDisplay
- if classDisplay is "list" then
- log " " & theName & theCount & " is " & item 1 of theEntry & "; " & item 2 of theEntry
- else
- log " " & theName & theCount & " is " & theEntry
- end if
- set theCount to theCount + 1
- end repeat
- on error errMsg number n
- log "==> No go in printList. " & errMsg & " with number " & n
- end try
- end printList
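- (*
- Example call for printList above. A sketch only: the name "part " and the list are made up for illustration.
- printList("part ", {"alpha", "beta"})
- This should log one line per item, numbered from 1. Entries that are themselves lists, such as the {type, size} pairs from clipboard info, are logged as "item 1; item 2".
- *)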
- -- ------------------------------------------------------
- (*
- splitTextToList splits a string on a delimiter and keeps the delimiter in the results.
- thisText is the input string
- delim is what to split on
- results returned in a list
- Total hack. We know textToList strips off delim, so add it back to every item after the first.
- *)
- on splitTextToList(thisText, delim)
- set returnedList to textToList(thisText, delim)
- set resultArray to {}
- copy item 1 of returnedList to the end of the resultArray
- repeat with i from 2 to (count of returnedList)
- set newElement to delim & item i of returnedList
- copy newElement to the end of the resultArray
- end repeat
- return resultArray
- end splitTextToList
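- (*
- Example call for splitTextToList above, a sketch with a made-up input string:
- splitTextToList("one http://a two http://b", "http")
- --> {"one ", "http://a two ", "http://b"}
- The first item is whatever precedes the first delim; every later item starts with delim.
- *)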
- -- ------------------------------------------------------
- (*
- Retrieve the text between the start tag and the end tag. Returns "" when nothing is found.
- *)
- on tagContent(theString, startTag, endTag)
- global debug
- try
- log "in tagContent. " & return & " startTag is " & startTag & " endTag is " & endTag
- set beginningOfTag to chompLeftAndTag(theString, startTag)
- if length of beginningOfTag ≤ 0 then
- set middleText to ""
- else
- printHeader("beginningOfTag", beginningOfTag)
- set endingOffset to (offset of endTag in beginningOfTag)
- -- offset is 0 when endTag is missing and 1 when nothing sits between the tags
- if endingOffset ≤ 1 then
- set middleText to ""
- else
- set middleText to text 1 thru (endingOffset - 1) of beginningOfTag
- printHeader("middleText", middleText)
- end if
- end if
- on error errMsg number n
- log "finding contained text failed. " & errMsg & " with number " & n
- set middleText to ""
- end try
- if debug ≥ 2 then log "returning with middleText is " & middleText
- return middleText
- end tagContent
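- (*
- Example call for tagContent above. A sketch only: it assumes chompLeftAndTag, defined elsewhere in this script, returns the text that follows startTag.
- tagContent("<html><title>Example Domain</title></html>", "<title>", "</title>")
- Under that assumption the call should return "Example Domain".
- *)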
- -- ------------------------------------------------------
- (*
- textToList splits a string into a list of strings.
- thisText is the input string
- delim is what to split on
- returns a list of strings.
- textToList was found here:
- http://macscripter.net/viewtopic.php?id=15423
- *)
- on textToList(thisText, delim)
- set resultList to {}
- set {tid, my text item delimiters} to {my text item delimiters, delim}
- try
- set resultList to every text item of thisText
- set my text item delimiters to tid
- on error
- set my text item delimiters to tid
- end try
- return resultList
- end textToList
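- (*
- Example call for textToList above, a sketch with a made-up input string:
- textToList("a,b,c", ",")
- --> {"a", "b", "c"}
- The previous text item delimiters are restored before the handler returns.
- *)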
- -- ------------------------------------------------------
- on convertToHTML(theData)
- global debug
- log "in convertToHTML" & return & " Try to send back HTML."
- try
- set clipboardDataQuoted to quoted form of theData
- printHeader("clipboardDataQuoted", clipboardDataQuoted)
- hexDumpFormat("clipboardDataQuoted", clipboardDataQuoted)
- -- make hex string as required for HTML data on the clipboard
- set toUnix to "/bin/echo -n " & clipboardDataQuoted & " | hexdump -ve '1/1 \"%.2x\"'"
- printHeader("toUnix to convert to hex", toUnix)
- set fromUnix to do shell script toUnix
- printHeader("fromUnix", fromUnix)
- if debug ≥ 2 then
- log "displaying original string --- so we can tell if it converted successfully. "
- --hexDumpFormat("fromUnix", fromUnix)
- end if
- on error errMsg number n
- log "==> convert to hex string failed. " & errMsg & " with number " & n
- set fromUnix to ""
- end try
- return fromUnix
- end convertToHTML
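- (*
- Example call for convertToHTML above. A sketch only: despite the name, the handler returns the input encoded as a hex string, two digits per byte, ready for postToCLipboard.
- set hex to convertToHTML("<p>Hi</p>")
- --> "3c703e48693c2f703e"
- if hex is not "" then postToCLipboard(hex)
- *)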
- -- ------------------------------------------------------
- on typeText(theData)
- (*
- Unix-like systems (Linux, macOS): LF, 0A, \n
- Microsoft Windows: CRLF, 0D 0A, \r\n
- classic Mac OS: CR, 0D, \r (AppleScript return)
- *)
- global debug
- set lf to character id 10
- log "in typeText"
- printHeader("the input ( theData )", theData)
- -- Example: -- https://discussions.apple.com/docs/DOC-8841
- -- locate links
- set theOutputBuffer to adjustURLs(theData)
- -- add paragraphs
- set theOutputBuffer to addParagraphs(theOutputBuffer)
- log "theOutputBuffer is " & return & theOutputBuffer
- return theOutputBuffer
- end typeText
- (*
- https://www.oreilly.com/library/view/applescript-the-definitive/0596102119/re89.html
- https://stackoverflow.com/questions/11085654/apple-script-how-can-i-copy-html-content-to-the-clipboard
- -- user has copied a file's icon in the Finder
- clipboard info
- -- {{string, 20}, {«class ut16», 44}, {«class hfs », 80}, {«class
- utf8», 20}, {Unicode text, 42}, {picture, 2616}, {«class icns», 43336},
- {«class furl», 62}}
- textutil -convert html foo.rtf
- if ((clipboard info) as string) contains "«class furl»" then
- log "the clipboard contains a file named " & (the clipboard as string)
- else
- log "the clipboard does not contain a file"
- end if
- the clipboard required
- as class optional
- tell application "Script Editor"
- activate
- end tell
- textutil has a simplistic text to html conversion
- set clipboardDataQuoted to quoted form of theData
- log "quoted form is " & clipboardDataQuoted
- set toUnix to "/bin/echo -n " & clipboardDataQuoted
- set toUnix to toUnix & " | textutil -convert html -noload -nostore -stdin -stdout "
- log "toUnix is " & toUnix
- set fromUnix to do shell script toUnix
- log "fromUnix is " & fromUnix
- set s to "Today is my birthday"
- log text 1 thru ((offset of "my" in s) - 1) of s
- --> "Today is "
- -- text 1 thru ((offset of "my" in s) - 1) of s
- -- -1 since offset return the first character "m" position count
- log "beginningOfTag is " & text 1 thru (minimumPositiveNumber from {200, length of beginningOfTag}) of beginningOfTag & "<+++++++++++++++++++++++"
- https://developer.apple.com/library/archive/documentation/AppleScript/Conceptual/AppleScriptLangGuide/reference/ASLR_cmds.html
- *)
- --mac $ hex=`echo -n "<p>your html code here</>" | hexdump -ve '1/1 "%.2x"'`
- --mac $ echo $hex
- --3c703e796f75722068746d6c20636f646520686572653c2f3e
- --mac $ osascript -e "set the clipboard to «data HTML${hex}»"
- --mac $
- (*
- A sub-routine for encoding ASCII characters.
- encode_char("$")
- --> returns: "%24"
- based on:
- https://www.macosxautomation.com/applescript/sbrt/sbrt-08.html
- *)
- (*
- Lowest Numeric Value in a List
- This sub-routine will return the lowest numeric value in a list of items. The passed list can contain non-numeric data as well as lists within lists. For example:
- lowest_number({-3.25, 23, 2345, "sid", 3, 67})
- --> returns: -3.25
- lowest_number({-3.25, 23, {-22, 78695, "bob"}, 2345, true, "sid", 3, 67})
- --> returns: -22
- If there is no numeric data in the passed list, the sub-routine will return a null string ("")
- lowest_number({"this", "list", "contains", "only", "text"})
- --> returns: ""
- https://macosxautomation.com/applescript/sbrt/sbrt-03.html
- Here's the sub-routine:
- *)
- (*
- on lowestNumber(values_list)
- set the low_amount to ""
- repeat with i from 1 to the count of the values_list
- set this_item to item i of the values_list
- set the item_class to the class of this_item
- if the item_class is in {integer, real} then
- if the low_amount is "" then
- set the low_amount to this_item
- else if this_item is less than the low_amount then
- set the low_amount to item i of the values_list
- end if
- else if the item_class is list then
- set the low_value to lowestNumber(this_item)
- if the low_value is less than the low_amount then ¬
- set the low_amount to the low_value
- end if
- end repeat
- return the low_amount
- end lowestNumber
- https://lists.apple.com/archives/applescript-users/2010/Sep/msg00139.html
- set list_of_values to {10, 20, 30, 40, 50, 60, 2000, 9, 3000, 4}
- set minimum to 9.9999999999E+12
- set maximum to 0
- repeat with ref_to_value in list_of_values
- set the_value to contents of ref_to_value
- if the_value > maximum then set maximum to the_value
- if the_value < minimum then set minimum to the_value
- end repeat
- {minimum, maximum}
- may do the trick.
- Yvan KOENIG (VALLAURIS, France) lundi 13 septembre 2010 22:32:41
- *)
- (* https://lists.apple.com/archives/applescript-users/2010/Sep/msg00139.html
- set list_of_values to {10, 20, 30, 40, 50, 60, 2000, 9, 3000, 4}
- set minimum to 9.9999999999E+12
- assume it's limited to positive values
- on maxValue(list_of_values)
- global debug
- if debug ≥ 3 then log "in maxValue " & return & list_of_values
- set maximum to 0
- repeat with ref_to_value in list_of_values
- set the_value to contents of ref_to_value
- if the_value > maximum then set maximum to the_value
- end repeat
- if debug ≥ 3 then log maximum
- return maximum
- end maxValue
- *)
- -- ------------------------------------------------------
- (*
- http://harvey.nu/applescript_url_encode_routine.html
- on urlencode(theText)
- set theTextEnc to ""
- repeat with eachChar in characters of theText
- set useChar to eachChar
- set eachCharNum to ASCII number of eachChar
- if eachCharNum = 32 then
- set useChar to "+"
- else if (eachCharNum ≠ 42) and (eachCharNum ≠ 95) and (eachCharNum < 45 or eachCharNum > 46) and (eachCharNum < 48 or eachCharNum > 57) and (eachCharNum < 65 or eachCharNum > 90) and (eachCharNum < 97 or eachCharNum > 122) then
- set firstDig to round (eachCharNum / 16) rounding down
- set secondDig to eachCharNum mod 16
- if firstDig > 9 then
- set aNum to firstDig + 55
- set firstDig to ASCII character aNum
- end if
- if secondDig > 9 then
- set aNum to secondDig + 55
- set secondDig to ASCII character aNum
- end if
- set numHex to ("%" & (firstDig as string) & (secondDig as string)) as string
- set useChar to numHex
- end if
- set theTextEnc to theTextEnc & useChar as string
- end repeat
- return theTextEnc
- end urlencode
- Clipboard classes after a copy from the application.
- from waterfox
- (*«class HTML», 13876, «class utf8», 505, «class ut16», 1012, string, 505, Unicode text, 1010*)
- from chrome
- (*«class HTML», 748, «class utf8», 204, «class ut16», 410, string, 204, Unicode text, 408*)
- from safari
- (*«class weba», 120785, «class RTF », 70255, «class HTML», 122811, «class utf8», 3370, «class ut16», 6772, uniform styles, 47132, string, 3385, scrap styles, 8122, Unicode text, 6732, uniform styles, 47132, scrap styles, 8122*)
- iCab
- (*«class weba», 1665, «class RTF », 763, «class utf8», 121, «class ut16», 244, uniform styles, 376, string, 121, scrap styles, 62, Unicode text, 242, uniform styles, 376, scrap styles, 62*)
- Opera
- (*«class HTML», 5767, «class utf8», 150, «class ut16», 302, string, 150, Unicode text, 300*)
- Textedit
- (*«class RTF », 1136, «class utf8», 138, «class ut16», 278, uniform styles, 148, string, 138, scrap styles, 22, Unicode text, 276, uniform styles, 148, scrap styles, 22*)
- Word
- (*«class DSIG», 4, «class DOBJ», 56, «class OBJD», 244, «class RTF », 30573, «class HTML», 21160, scrap styles, 22, uniform styles, 136, string, 210, Unicode text, 420, «class PDF », 13197, picture, 154058, «class EMBS», 33280, «class LNKS», 909, «class LKSD», 244, «class OJLK», 93, «class HLNK», 1387, «class OFSC», 232, «class ut16», 422, «class DSIG», 4, «class DOBJ», 56, «class OBJD», 244, scrap styles, 22, uniform styles, 136, «class EMBS», 33280, «class LNKS», 909, «class LKSD», 244, «class OJLK», 93, «class HLNK», 1387, «class OFSC», 232*)
- TextWrangler
- (*«class utf8», 185, «class BBLM», 4, «class ut16», 372, string, 185, Unicode text, 370, «class BBLM», 4*)
- *)